Overview


Dataset statistics

Number of variables: 151
Number of observations: 604626
Missing cells: 50336207
Missing cells (%): 55.1%
Total size in memory: 696.6 MiB
Average record size in memory: 1.2 KiB
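
The two derived figures in this block can be cross-checked from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is missing cells divided by variables times observations, and the average record size is total memory divided by observations. The variable names are illustrative and not taken from the report.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 151
n_observations = 604626
missing_cells = 50336207
total_mib = 696.6

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Average memory per record, converted from MiB to KiB.
avg_record_kib = total_mib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_kib:.1f} KiB")
```

Under these assumptions the computed values round to the 55.1% and 1.2 KiB shown above.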

Variable types

Text: 151

Dataset

Description: Entomology NMNH Extant Extant Specimen Records 0052484-241126133413365
URL: https://doi.org/10.15468/dl.ptewed

Alerts

license has constant value "CC0_1_0" Constant
publisher has constant value "National Museum of Natural History, Smithsonian Institution" Constant
institutionID has constant value "urn:lsid:biocol.org:col:34871" Constant
collectionID has constant value "urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad" Constant
institutionCode has constant value "USNM" Constant
collectionCode has constant value "ENT" Constant
datasetName has constant value "NMNH Extant Biology" Constant
occurrenceStatus has constant value "PRESENT" Constant
verbatimLabel has constant value "-11.7815" Constant
materialSampleID has constant value "-76.7017" Constant
verbatimDepth has constant value "220m inside cave entrance" Constant
verbatimCoordinateSystem has constant value "Degrees Minutes Seconds" Constant
verbatimSRS has constant value "1973-05-08" Constant
footprintSRS has constant value "128" Constant
footprintSpatialFit has constant value "128" Constant
georeferencedDate has constant value "5" Constant
earliestEraOrLowestErathem has constant value "Animalia" Constant
latestEraOrHighestErathem has constant value "Arthropoda" Constant
earliestPeriodOrLowestSystem has constant value "Insecta" Constant
group has constant value "Florida" Constant
formation has constant value "Pinellas" Constant
verbatimIdentification has constant value "SPECIES" Constant
identifiedByID has constant value "ACCEPTED" Constant
taxonConceptID has constant value "StillImage" Constant
acceptedNameUsage has constant value "false" Constant
nameAccordingTo has constant value "1" Constant
namePublishedIn has constant value "54" Constant
namePublishedInYear has constant value "216" Constant
subtribe has constant value "EML" Constant
subgenus has constant value "true" Constant
verbatimTaxonRank has constant value "PER" Constant
nomenclaturalCode has constant value "PER.16_1" Constant
nomenclaturalStatus has constant value "PER.16.6_1" Constant
taxonRemarks has constant value "Huarochiri" Constant
subgenusKey has constant value "Insecta" Constant
protocol has constant value "EML" Constant
projectId has constant value "roseni" Constant
isSequenced has constant value "false" Constant
catalogNumber has 233418 (38.6%) missing values Missing
recordNumber has 604589 (> 99.9%) missing values Missing
recordedBy has 203336 (33.6%) missing values Missing
sex has 384462 (63.6%) missing values Missing
lifeStage has 184129 (30.5%) missing values Missing
preparations has 42051 (7.0%) missing values Missing
occurrenceRemarks has 459276 (76.0%) missing values Missing
verbatimLabel has 604625 (> 99.9%) missing values Missing
materialSampleID has 604625 (> 99.9%) missing values Missing
fieldNumber has 600377 (99.3%) missing values Missing
eventDate has 239769 (39.7%) missing values Missing
startDayOfYear has 270965 (44.8%) missing values Missing
endDayOfYear has 270965 (44.8%) missing values Missing
year has 240229 (39.7%) missing values Missing
month has 254573 (42.1%) missing values Missing
day has 314935 (52.1%) missing values Missing
verbatimEventDate has 396306 (65.5%) missing values Missing
habitat has 604427 (> 99.9%) missing values Missing
locationID has 603581 (99.8%) missing values Missing
higherGeography has 156072 (25.8%) missing values Missing
continent has 199137 (32.9%) missing values Missing
islandGroup has 602107 (99.6%) missing values Missing
island has 595261 (98.5%) missing values Missing
countryCode has 163440 (27.0%) missing values Missing
stateProvince has 173217 (28.6%) missing values Missing
county has 254826 (42.1%) missing values Missing
locality has 158340 (26.2%) missing values Missing
verbatimElevation has 594692 (98.4%) missing values Missing
verbatimDepth has 604620 (> 99.9%) missing values Missing
minimumDistanceAboveSurfaceInMeters has 604624 (> 99.9%) missing values Missing
decimalLatitude has 285575 (47.2%) missing values Missing
decimalLongitude has 285575 (47.2%) missing values Missing
coordinateUncertaintyInMeters has 592674 (98.0%) missing values Missing
pointRadiusSpatialFit has 604624 (> 99.9%) missing values Missing
verbatimCoordinateSystem has 604625 (> 99.9%) missing values Missing
verbatimSRS has 604625 (> 99.9%) missing values Missing
footprintSRS has 604625 (> 99.9%) missing values Missing
footprintSpatialFit has 604625 (> 99.9%) missing values Missing
georeferencedBy has 604623 (> 99.9%) missing values Missing
georeferencedDate has 604625 (> 99.9%) missing values Missing
georeferenceProtocol has 366755 (60.7%) missing values Missing
georeferenceSources has 604624 (> 99.9%) missing values Missing
georeferenceRemarks has 596178 (98.6%) missing values Missing
latestEonOrHighestEonothem has 604624 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 604624 (> 99.9%) missing values Missing
latestEraOrHighestErathem has 604624 (> 99.9%) missing values Missing
earliestPeriodOrLowestSystem has 604624 (> 99.9%) missing values Missing
latestPeriodOrHighestSystem has 604624 (> 99.9%) missing values Missing
latestEpochOrHighestSeries has 604622 (> 99.9%) missing values Missing
earliestAgeOrLowestStage has 604624 (> 99.9%) missing values Missing
highestBiostratigraphicZone has 604624 (> 99.9%) missing values Missing
lithostratigraphicTerms has 604622 (> 99.9%) missing values Missing
group has 604625 (> 99.9%) missing values Missing
formation has 604625 (> 99.9%) missing values Missing
member has 604624 (> 99.9%) missing values Missing
bed has 604624 (> 99.9%) missing values Missing
verbatimIdentification has 604624 (> 99.9%) missing values Missing
identificationQualifier has 603189 (99.8%) missing values Missing
typeStatus has 486591 (80.5%) missing values Missing
identifiedBy has 454955 (75.2%) missing values Missing
identifiedByID has 604624 (> 99.9%) missing values Missing
identificationVerificationStatus has 604622 (> 99.9%) missing values Missing
identificationRemarks has 604622 (> 99.9%) missing values Missing
taxonID has 604624 (> 99.9%) missing values Missing
namePublishedInID has 604624 (> 99.9%) missing values Missing
taxonConceptID has 604625 (> 99.9%) missing values Missing
acceptedNameUsage has 604624 (> 99.9%) missing values Missing
parentNameUsage has 604623 (> 99.9%) missing values Missing
originalNameUsage has 604624 (> 99.9%) missing values Missing
nameAccordingTo has 604624 (> 99.9%) missing values Missing
namePublishedIn has 604624 (> 99.9%) missing values Missing
namePublishedInYear has 604624 (> 99.9%) missing values Missing
superfamily has 604624 (> 99.9%) missing values Missing
family has 11642 (1.9%) missing values Missing
subfamily has 604624 (> 99.9%) missing values Missing
subtribe has 604624 (> 99.9%) missing values Missing
genus has 19883 (3.3%) missing values Missing
genericName has 19882 (3.3%) missing values Missing
subgenus has 604624 (> 99.9%) missing values Missing
specificEpithet has 109508 (18.1%) missing values Missing
infraspecificEpithet has 586367 (97.0%) missing values Missing
cultivarEpithet has 604624 (> 99.9%) missing values Missing
verbatimTaxonRank has 604625 (> 99.9%) missing values Missing
vernacularName has 604624 (> 99.9%) missing values Missing
nomenclaturalCode has 604625 (> 99.9%) missing values Missing
nomenclaturalStatus has 604625 (> 99.9%) missing values Missing
taxonRemarks has 604625 (> 99.9%) missing values Missing
elevation has 557870 (92.3%) missing values Missing
elevationAccuracy has 573282 (94.8%) missing values Missing
depth has 604592 (> 99.9%) missing values Missing
depthAccuracy has 604615 (> 99.9%) missing values Missing
distanceFromCentroidInMeters has 601631 (99.5%) missing values Missing
mediaType has 369838 (61.2%) missing values Missing
familyKey has 11642 (1.9%) missing values Missing
genusKey has 19883 (3.3%) missing values Missing
subgenusKey has 604624 (> 99.9%) missing values Missing
speciesKey has 109501 (18.1%) missing values Missing
species has 109503 (18.1%) missing values Missing
repatriated has 162658 (26.9%) missing values Missing
projectId has 604625 (> 99.9%) missing values Missing
gbifRegion has 163113 (27.0%) missing values Missing
level0Gid has 288722 (47.8%) missing values Missing
level0Name has 288722 (47.8%) missing values Missing
level1Gid has 288806 (47.8%) missing values Missing
level1Name has 288804 (47.8%) missing values Missing
level2Gid has 297499 (49.2%) missing values Missing
level2Name has 297510 (49.2%) missing values Missing
level3Gid has 540301 (89.4%) missing values Missing
level3Name has 541181 (89.5%) missing values Missing
iucnRedListCategory has 96088 (15.9%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
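
The alert labels above can be approximated with a few counting rules. This is a sketch inferred from the labels as displayed, not the profiler's actual implementation: it assumes "Constant" means a single distinct non-missing value, "Missing" means at least one missing cell, and "Unique" means every value is present and different. The function names `column_alerts` and `format_pct` are hypothetical.

```python
from collections import Counter


def column_alerts(values):
    """Return alert labels for one column; None marks a missing cell."""
    n = len(values)
    present = [v for v in values if v is not None]
    counts = Counter(present)
    alerts = []
    # One distinct value among the non-missing cells.
    if present and len(counts) == 1:
        alerts.append("Constant")
    # At least one missing cell.
    if len(present) < n:
        alerts.append("Missing")
    # No missing cells and no repeated values.
    if n > 0 and len(present) == n and len(counts) == n:
        alerts.append("Unique")
    return alerts


def format_pct(count, total):
    """Format a share the way the alerts display it, e.g. '> 99.9%'."""
    pct = 100 * count / total
    if 0 < pct < 0.1:
        return "< 0.1%"
    if 99.9 < pct < 100:
        return "> 99.9%"
    return f"{pct:.1f}%"


# A column with one value and otherwise missing cells gets both alerts,
# as verbatimLabel does in the list above.
print(column_alerts(["-11.7815", None, None]))
print(format_pct(604625, 604626))
```

This explains why a column such as verbatimLabel appears twice in the list: it is constant among its non-missing cells and also almost entirely missing.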

Reproduction

Analysis started: 2025-01-08 22:47:20.970730
Analysis finished: 2025-01-08 22:47:55.454125
Duration: 34.48 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
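
A report of this kind could be regenerated along the following lines. This is a sketch only: it assumes the download is a tab-separated text file without quoting and that every column is read as text. The function name, file paths, and report title are hypothetical, and the configuration actually used for this report is the one in config.json.

```python
import csv


def profile_gbif_download(tsv_path, html_path):
    """Profile a tab-separated occurrence download and write an HTML report."""
    # Imported inside the function so the third-party dependencies are only
    # needed when a report is actually built.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumptions: tab-separated input, no quoting, all columns read as text.
    df = pd.read_csv(tsv_path, sep="\t", dtype=str, quoting=csv.QUOTE_NONE)

    report = ProfileReport(
        df,
        title="Entomology NMNH Extant Specimen Records",  # hypothetical title
        dataset={
            # Description and URL copied from the Dataset block of this report.
            "description": "Entomology NMNH Extant Extant Specimen Records "
                           "0052484-241126133413365",
            "url": "https://doi.org/10.15468/dl.ptewed",
        },
    )
    report.to_file(html_path)
    return report
```

A call such as `profile_gbif_download("occurrence.txt", "report.html")` would then write the HTML file; both file names here are placeholders.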

Variables

gbifID
Text

Unique 

Distinct: 604626
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 6046260
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
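
The general-category counts reported for each variable can be illustrated with Python's standard `unicodedata` module. This is a minimal sketch, not the profiler's own code. The helper name `category_counts` is hypothetical, and the standard library returns two-letter codes (`Nd` for Decimal Number, `Lu` for Uppercase Letter, `Pc` for Connector Punctuation) rather than the long names used in this report.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


def is_ascii(text):
    """True when every character falls in the ASCII block (code points 0-127)."""
    return all(ord(ch) < 128 for ch in text)


# Sample values taken from this report.
print(category_counts("1321729650"))   # a gbifID: digits only
print(category_counts("CC0_1_0"))      # the license value: letters, digits, underscores
print(is_ascii("CC0_1_0"))
```

The standard library does not expose the Unicode script property, so the script counts would need a third-party package.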

Unique

Unique: 604626
Unique (%): 100.0%

Sample

1st row: 1321729650
2nd row: 1320180785
3rd row: 4403931423
4th row: 1320185860
5th row: 1320185980

Value                   Count    Frequency (%)
1321729650              1        < 0.1%
1321751610              1        < 0.1%
1828939237              1        < 0.1%
1321753851              1        < 0.1%
4403917418              1        < 0.1%
1321742115              1        < 0.1%
4403931423              1        < 0.1%
1320185860              1        < 0.1%
1320185980              1        < 0.1%
2236094411              1        < 0.1%
Other values (604616)   604616   > 99.9%

Most occurring characters

Value   Count     Frequency (%)
1       1132843   18.7%
3       860404    14.2%
2       781753    12.9%
0       530599    8.8%
8       513679    8.5%
9       488164    8.1%
7       473950    7.8%
4       451737    7.5%
5       410650    6.8%
6       402481    6.7%

Most occurring categories

Value            Count     Frequency (%)
Decimal Number   6046260   100.0%

Most frequent character per category

Decimal Number
Value   Count     Frequency (%)
1       1132843   18.7%
3       860404    14.2%
2       781753    12.9%
0       530599    8.8%
8       513679    8.5%
9       488164    8.1%
7       473950    7.8%
4       451737    7.5%
5       410650    6.8%
6       402481    6.7%

Most occurring scripts

Value    Count     Frequency (%)
Common   6046260   100.0%

Most frequent character per script

Common
Value   Count     Frequency (%)
1       1132843   18.7%
3       860404    14.2%
2       781753    12.9%
0       530599    8.8%
8       513679    8.5%
9       488164    8.1%
7       473950    7.8%
4       451737    7.5%
5       410650    6.8%
6       402481    6.7%

Most occurring blocks

Value   Count     Frequency (%)
ASCII   6046260   100.0%

Most frequent character per block

ASCII
Value   Count     Frequency (%)
1       1132843   18.7%
3       860404    14.2%
2       781753    12.9%
0       530599    8.8%
8       513679    8.5%
9       488164    8.1%
7       473950    7.8%
4       451737    7.5%
5       410650    6.8%
6       402481    6.7%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 4232382
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0

Value     Count    Frequency (%)
cc0_1_0   604626   100.0%

Most occurring characters

Value   Count     Frequency (%)
C       1209252   28.6%
0       1209252   28.6%
_       1209252   28.6%
1       604626    14.3%

Most occurring categories

Value                   Count     Frequency (%)
Decimal Number          1813878   42.9%
Uppercase Letter        1209252   28.6%
Connector Punctuation   1209252   28.6%

Most frequent character per category

Decimal Number
Value   Count     Frequency (%)
0       1209252   66.7%
1       604626    33.3%

Uppercase Letter
Value   Count     Frequency (%)
C       1209252   100.0%

Connector Punctuation
Value   Count     Frequency (%)
_       1209252   100.0%

Most occurring scripts

Value    Count     Frequency (%)
Common   3023130   71.4%
Latin    1209252   28.6%

Most frequent character per script

Common
Value   Count     Frequency (%)
0       1209252   40.0%
_       1209252   40.0%
1       604626    20.0%

Latin
Value   Count     Frequency (%)
C       1209252   100.0%

Most occurring blocks

Value   Count     Frequency (%)
ASCII   4232382   100.0%

Most frequent character per block

ASCII
Value   Count     Frequency (%)
C       1209252   28.6%
0       1209252   28.6%
_       1209252   28.6%
1       604626    14.3%

modified
Text

Distinct: 56588
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 12092520
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 30778
Unique (%): 5.1%
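
The Distinct and Unique figures differ for this variable (56588 versus 30778). The sketch below shows one reading that is consistent with those numbers: Distinct counts the different values present, while Unique counts only the values that occur exactly once. This is an assumption about the report's definitions, and the function name is hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy column: one timestamp is repeated, two appear once.
timestamps = [
    "2017-04-17T11:48:00Z",
    "2017-04-17T11:48:00Z",
    "2013-09-16T11:56:00Z",
    "2016-06-09T14:33:00Z",
]
print(distinct_and_unique(timestamps))
```

Under this reading, Unique can never exceed Distinct, and the two are equal only when no value repeats.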

Sample

1st row: 2013-09-16T11:56:00Z
2nd row: 2016-06-09T14:33:00Z
3rd row: 2023-08-23T09:36:00Z
4th row: 2023-05-19T10:32:00Z
5th row: 2015-10-05T15:58:00Z

Value                  Count    Frequency (%)
2017-04-17t11:48:00z   9681     1.6%
2017-04-17t11:49:00z   9420     1.6%
2017-04-17t11:50:00z   8719     1.4%
2017-04-17t11:47:00z   8654     1.4%
2017-04-17t11:46:00z   6000     1.0%
2021-08-23t15:49:00z   3095     0.5%
2021-08-23t15:48:00z   3057     0.5%
2016-07-27t14:05:00z   3041     0.5%
2016-07-27t14:06:00z   1844     0.3%
2021-08-23t15:50:00z   1737     0.3%
Other values (56578)   549378   90.9%

Most occurring characters

Value              Count     Frequency (%)
0                  2927853   24.2%
1                  1566143   13.0%
2                  1372338   11.3%
-                  1209252   10.0%
:                  1209252   10.0%
T                  604626    5.0%
Z                  604626    5.0%
3                  593038    4.9%
5                  494587    4.1%
4                  456514    3.8%
Other values (4)   1054291   8.7%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      8464764   70.0%
Dash Punctuation    1209252   10.0%
Other Punctuation   1209252   10.0%
Uppercase Letter    1209252   10.0%

Most frequent character per category

Decimal Number
Value   Count     Frequency (%)
0       2927853   34.6%
1       1566143   18.5%
2       1372338   16.2%
3       593038    7.0%
5       494587    5.8%
4       456514    5.4%
9       314284    3.7%
7       310920    3.7%
6       238170    2.8%
8       190917    2.3%

Uppercase Letter
Value   Count    Frequency (%)
T       604626   50.0%
Z       604626   50.0%

Dash Punctuation
Value   Count     Frequency (%)
-       1209252   100.0%

Other Punctuation
Value   Count     Frequency (%)
:       1209252   100.0%

Most occurring scripts

Value    Count      Frequency (%)
Common   10883268   90.0%
Latin    1209252    10.0%

Most frequent character per script

Common
Value              Count     Frequency (%)
0                  2927853   26.9%
1                  1566143   14.4%
2                  1372338   12.6%
-                  1209252   11.1%
:                  1209252   11.1%
3                  593038    5.4%
5                  494587    4.5%
4                  456514    4.2%
9                  314284    2.9%
7                  310920    2.9%
Other values (2)   429087    3.9%

Latin
Value   Count    Frequency (%)
T       604626   50.0%
Z       604626   50.0%

Most occurring blocks

Value   Count      Frequency (%)
ASCII   12092520   100.0%

Most frequent character per block

ASCII
Value              Count     Frequency (%)
0                  2927853   24.2%
1                  1566143   13.0%
2                  1372338   11.3%
-                  1209252   10.0%
:                  1209252   10.0%
T                  604626    5.0%
Z                  604626    5.0%
3                  593038    4.9%
5                  494587    4.1%
4                  456514    3.8%
Other values (4)   1054291   8.7%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 59
Median length: 59
Mean length: 59
Min length: 59

Characters and Unicode

Total characters: 35672934
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: National Museum of Natural History, Smithsonian Institution
2nd row: National Museum of Natural History, Smithsonian Institution
3rd row: National Museum of Natural History, Smithsonian Institution
4th row: National Museum of Natural History, Smithsonian Institution
5th row: National Museum of Natural History, Smithsonian Institution

Value         Count    Frequency (%)
national      604626   14.3%
museum        604626   14.3%
of            604626   14.3%
natural       604626   14.3%
history       604626   14.3%
smithsonian   604626   14.3%
institution   604626   14.3%
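
The word table for this variable counts lowercased words rather than whole values, and "history" appears without its trailing comma. The sketch below reproduces that behaviour under an assumption about the tokenisation: split on whitespace, lowercase, and strip leading and trailing punctuation. It is an approximation inferred from the tables in this report, not the profiler's implementation, and the function name is hypothetical.

```python
import string
from collections import Counter


def word_frequencies(values):
    """Map each lowercased word to (count, share of all words in percent)."""
    words = Counter()
    for value in values:
        for token in value.lower().split():
            # Drop punctuation at the edges only, so "history," becomes
            # "history" while "urn:lsid:biocol.org:col:34871" stays whole.
            token = token.strip(string.punctuation)
            if token:
                words[token] += 1
    total = sum(words.values())
    return {word: (count, 100 * count / total) for word, count in words.items()}


column = ["National Museum of Natural History, Smithsonian Institution"] * 3
for word, (count, pct) in word_frequencies(column).items():
    print(f"{word:12s} {count:3d} {pct:5.1f}%")
```

With seven words per value, each word takes one seventh of the total, which matches the 14.3% shown for every row.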

Most occurring characters

Value               Count     Frequency (%)
t                   4232382   11.9%
i                   3627756   10.2%
(space)             3627756   10.2%
a                   3023130   8.5%
o                   3023130   8.5%
n                   3023130   8.5%
s                   2418504   6.8%
u                   2418504   6.8%
r                   1209252   3.4%
m                   1209252   3.4%
Other values (11)   7860138   22.0%

Most occurring categories

Value               Count      Frequency (%)
Lowercase Letter    27812796   78.0%
Space Separator     3627756    10.2%
Uppercase Letter    3627756    10.2%
Other Punctuation   604626     1.7%

Most frequent character per category

Lowercase Letter
Value              Count     Frequency (%)
t                  4232382   15.2%
i                  3627756   13.0%
a                  3023130   10.9%
o                  3023130   10.9%
n                  3023130   10.9%
s                  2418504   8.7%
u                  2418504   8.7%
r                  1209252   4.3%
m                  1209252   4.3%
l                  1209252   4.3%
Other values (4)   2418504   8.7%

Uppercase Letter
Value   Count     Frequency (%)
N       1209252   33.3%
M       604626    16.7%
H       604626    16.7%
S       604626    16.7%
I       604626    16.7%

Space Separator
Value     Count     Frequency (%)
(space)   3627756   100.0%

Other Punctuation
Value   Count    Frequency (%)
,       604626   100.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    31440552   88.1%
Common   4232382    11.9%

Most frequent character per script

Latin
Value              Count     Frequency (%)
t                  4232382   13.5%
i                  3627756   11.5%
a                  3023130   9.6%
o                  3023130   9.6%
n                  3023130   9.6%
s                  2418504   7.7%
u                  2418504   7.7%
r                  1209252   3.8%
m                  1209252   3.8%
N                  1209252   3.8%
Other values (9)   6046260   19.2%

Common
Value     Count     Frequency (%)
(space)   3627756   85.7%
,         604626    14.3%

Most occurring blocks

Value   Count      Frequency (%)
ASCII   35672934   100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
t                   4232382   11.9%
i                   3627756   10.2%
(space)             3627756   10.2%
a                   3023130   8.5%
o                   3023130   8.5%
n                   3023130   8.5%
s                   2418504   6.8%
u                   2418504   6.8%
r                   1209252   3.4%
m                   1209252   3.4%
Other values (11)   7860138   22.0%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 17534154
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value                           Count    Frequency (%)
urn:lsid:biocol.org:col:34871   604626   100.0%

Most occurring characters

Value              Count     Frequency (%)
o                  2418504   13.8%
:                  2418504   13.8%
l                  1813878   10.3%
i                  1209252   6.9%
r                  1209252   6.9%
c                  1209252   6.9%
g                  604626    3.4%
7                  604626    3.4%
8                  604626    3.4%
4                  604626    3.4%
Other values (8)   4837008   27.6%

Most occurring categories

Value               Count      Frequency (%)
Lowercase Letter    11487894   65.5%
Other Punctuation   3023130    17.2%
Decimal Number      3023130    17.2%

Most frequent character per category

Lowercase Letter
Value   Count     Frequency (%)
o       2418504   21.1%
l       1813878   15.8%
i       1209252   10.5%
r       1209252   10.5%
c       1209252   10.5%
g       604626    5.3%
u       604626    5.3%
b       604626    5.3%
d       604626    5.3%
s       604626    5.3%

Decimal Number
Value   Count    Frequency (%)
7       604626   20.0%
8       604626   20.0%
4       604626   20.0%
3       604626   20.0%
1       604626   20.0%

Other Punctuation
Value   Count     Frequency (%)
:       2418504   80.0%
.       604626    20.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    11487894   65.5%
Common   6046260    34.5%

Most frequent character per script

Latin
Value   Count     Frequency (%)
o       2418504   21.1%
l       1813878   15.8%
i       1209252   10.5%
r       1209252   10.5%
c       1209252   10.5%
g       604626    5.3%
u       604626    5.3%
b       604626    5.3%
d       604626    5.3%
s       604626    5.3%

Common
Value   Count     Frequency (%)
:       2418504   40.0%
7       604626    10.0%
8       604626    10.0%
4       604626    10.0%
3       604626    10.0%
.       604626    10.0%
1       604626    10.0%

Most occurring blocks

Value   Count      Frequency (%)
ASCII   17534154   100.0%

Most frequent character per block

ASCII
Value              Count     Frequency (%)
o                  2418504   13.8%
:                  2418504   13.8%
l                  1813878   10.3%
i                  1209252   6.9%
r                  1209252   6.9%
c                  1209252   6.9%
g                  604626    3.4%
7                  604626    3.4%
8                  604626    3.4%
4                  604626    3.4%
Other values (8)   4837008   27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 27208170
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad
2nd row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad
3rd row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad
4th row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad
5th row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad

Value                                           Count    Frequency (%)
urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad   604626   100.0%

Most occurring characters

Value               Count     Frequency (%)
0                   3023130   11.1%
a                   2418504   8.9%
-                   2418504   8.9%
d                   1813878   6.7%
c                   1813878   6.7%
u                   1813878   6.7%
8                   1209252   4.4%
3                   1209252   4.4%
:                   1209252   4.4%
9                   1209252   4.4%
Other values (12)   9069390   33.3%

Most occurring categories

Value               Count      Frequency (%)
Lowercase Letter    12092520   44.4%
Decimal Number      11487894   42.2%
Dash Punctuation    2418504    8.9%
Other Punctuation   1209252    4.4%

Most frequent character per category

Decimal Number
Value   Count     Frequency (%)
0       3023130   26.3%
8       1209252   10.5%
3       1209252   10.5%
9       1209252   10.5%
6       1209252   10.5%
2       1209252   10.5%
1       604626    5.3%
4       604626    5.3%
7       604626    5.3%
5       604626    5.3%

Lowercase Letter
Value   Count     Frequency (%)
a       2418504   20.0%
d       1813878   15.0%
c       1813878   15.0%
u       1813878   15.0%
b       1209252   10.0%
e       604626    5.0%
i       604626    5.0%
r       604626    5.0%
n       604626    5.0%
f       604626    5.0%

Dash Punctuation
Value   Count     Frequency (%)
-       2418504   100.0%

Other Punctuation
Value   Count     Frequency (%)
:       1209252   100.0%

Most occurring scripts

Value    Count      Frequency (%)
Common   15115650   55.6%
Latin    12092520   44.4%

Most frequent character per script

Common
Value              Count     Frequency (%)
0                  3023130   20.0%
-                  2418504   16.0%
8                  1209252   8.0%
3                  1209252   8.0%
:                  1209252   8.0%
9                  1209252   8.0%
6                  1209252   8.0%
2                  1209252   8.0%
1                  604626    4.0%
4                  604626    4.0%
Other values (2)   1209252   8.0%

Latin
Value   Count     Frequency (%)
a       2418504   20.0%
d       1813878   15.0%
c       1813878   15.0%
u       1813878   15.0%
b       1209252   10.0%
e       604626    5.0%
i       604626    5.0%
r       604626    5.0%
n       604626    5.0%
f       604626    5.0%

Most occurring blocks

Value   Count      Frequency (%)
ASCII   27208170   100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
0                   3023130   11.1%
a                   2418504   8.9%
-                   2418504   8.9%
d                   1813878   6.7%
c                   1813878   6.7%
u                   1813878   6.7%
8                   1209252   4.4%
3                   1209252   4.4%
:                   1209252   4.4%
9                   1209252   4.4%
Other values (12)   9069390   33.3%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2418504
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM

Value   Count    Frequency (%)
usnm    604626   100.0%

Most occurring characters

Value   Count    Frequency (%)
U       604626   25.0%
S       604626   25.0%
N       604626   25.0%
M       604626   25.0%

Most occurring categories

Value              Count     Frequency (%)
Uppercase Letter   2418504   100.0%

Most frequent character per category

Uppercase Letter
Value   Count    Frequency (%)
U       604626   25.0%
S       604626   25.0%
N       604626   25.0%
M       604626   25.0%

Most occurring scripts

Value   Count     Frequency (%)
Latin   2418504   100.0%

Most frequent character per script

Latin
Value   Count    Frequency (%)
U       604626   25.0%
S       604626   25.0%
N       604626   25.0%
M       604626   25.0%

Most occurring blocks

Value   Count     Frequency (%)
ASCII   2418504   100.0%

Most frequent character per block

ASCII
Value   Count    Frequency (%)
U       604626   25.0%
S       604626   25.0%
N       604626   25.0%
M       604626   25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1813878
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ENT
2nd row: ENT
3rd row: ENT
4th row: ENT
5th row: ENT

Value   Count    Frequency (%)
ent     604626   100.0%

Most occurring characters

Value   Count    Frequency (%)
E       604626   33.3%
N       604626   33.3%
T       604626   33.3%

Most occurring categories

Value              Count     Frequency (%)
Uppercase Letter   1813878   100.0%

Most frequent character per category

Uppercase Letter
Value   Count    Frequency (%)
E       604626   33.3%
N       604626   33.3%
T       604626   33.3%

Most occurring scripts

Value   Count     Frequency (%)
Latin   1813878   100.0%

Most frequent character per script

Latin
Value   Count    Frequency (%)
E       604626   33.3%
N       604626   33.3%
T       604626   33.3%

Most occurring blocks

Value   Count     Frequency (%)
ASCII   1813878   100.0%

Most frequent character per block

ASCII
Value   Count    Frequency (%)
E       604626   33.3%
N       604626   33.3%
T       604626   33.3%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11487894
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value     Count    Frequency (%)
nmnh      604626   33.3%
extant    604626   33.3%
biology   604626   33.3%

Most occurring characters

Value              Count     Frequency (%)
N                  1209252   10.5%
(space)            1209252   10.5%
t                  1209252   10.5%
o                  1209252   10.5%
M                  604626    5.3%
H                  604626    5.3%
E                  604626    5.3%
x                  604626    5.3%
a                  604626    5.3%
n                  604626    5.3%
Other values (5)   3023130   26.3%

Most occurring categories

Value              Count     Frequency (%)
Lowercase Letter   6650886   57.9%
Uppercase Letter   3627756   31.6%
Space Separator    1209252   10.5%

Most frequent character per category

Lowercase Letter
Value   Count     Frequency (%)
t       1209252   18.2%
o       1209252   18.2%
x       604626    9.1%
a       604626    9.1%
n       604626    9.1%
i       604626    9.1%
l       604626    9.1%
g       604626    9.1%
y       604626    9.1%

Uppercase Letter
Value   Count     Frequency (%)
N       1209252   33.3%
M       604626    16.7%
H       604626    16.7%
E       604626    16.7%
B       604626    16.7%

Space Separator
Value     Count     Frequency (%)
(space)   1209252   100.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    10278642   89.5%
Common   1209252    10.5%

Most frequent character per script

Latin
Value              Count     Frequency (%)
N                  1209252   11.8%
t                  1209252   11.8%
o                  1209252   11.8%
M                  604626    5.9%
H                  604626    5.9%
E                  604626    5.9%
x                  604626    5.9%
a                  604626    5.9%
n                  604626    5.9%
B                  604626    5.9%
Other values (4)   2418504   23.5%

Common
Value     Count     Frequency (%)
(space)   1209252   100.0%

Most occurring blocks

Value   Count      Frequency (%)
ASCII   11487894   100.0%

Most frequent character per block

ASCII
Value              Count     Frequency (%)
N                  1209252   10.5%
(space)            1209252   10.5%
t                  1209252   10.5%
o                  1209252   10.5%
M                  604626    5.3%
H                  604626    5.3%
E                  604626    5.3%
x                  604626    5.3%
a                  604626    5.3%
n                  604626    5.3%
Other values (5)   3023130   26.3%

basisOfRecord
Text

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 18
Median length: 18
Mean length: 17.99374986
Min length: 17

Characters and Unicode

Total characters: 10879489
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESERVED_SPECIMEN
2nd row: PRESERVED_SPECIMEN
3rd row: PRESERVED_SPECIMEN
4th row: PRESERVED_SPECIMEN
5th row: PRESERVED_SPECIMEN

Value                Count    Frequency (%)
preserved_specimen   600847   99.4%
human_observation    3779     0.6%
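
The fractional mean length and the total character count for this variable follow directly from the two value counts. The sketch below is a worked check, assuming the mean length is the count-weighted average of the value lengths. The variable names are illustrative.

```python
# Value counts copied from the table above (shown there in lowercase).
value_counts = {
    "PRESERVED_SPECIMEN": 600847,  # 18 characters
    "HUMAN_OBSERVATION": 3779,     # 17 characters
}

# Total characters is each value's length weighted by how often it occurs.
total_characters = sum(len(value) * n for value, n in value_counts.items())
n_rows = sum(value_counts.values())

# Mean length is total characters divided by the number of rows.
mean_length = total_characters / n_rows

print(f"rows: {n_rows}")
print(f"total characters: {total_characters}")
print(f"mean length: {mean_length:.8f}")
```

Both results agree with the figures reported for this variable: 10879489 total characters and a mean length of 17.99374986.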

Most occurring characters

Value              Count     Frequency (%)
E                  3008014   27.6%
R                  1205473   11.1%
S                  1205473   11.1%
P                  1201694   11.0%
N                  608405    5.6%
M                  604626    5.6%
I                  604626    5.6%
_                  604626    5.6%
V                  604626    5.6%
C                  600847    5.5%
Other values (7)   631079    5.8%

Most occurring categories

Value                   Count      Frequency (%)
Uppercase Letter        10274863   94.4%
Connector Punctuation   604626     5.6%

Most frequent character per category

Uppercase Letter
Value              Count     Frequency (%)
E                  3008014   29.3%
R                  1205473   11.7%
S                  1205473   11.7%
P                  1201694   11.7%
N                  608405    5.9%
M                  604626    5.9%
I                  604626    5.9%
V                  604626    5.9%
C                  600847    5.8%
D                  600847    5.8%
Other values (6)   30232     0.3%

Connector Punctuation
Value   Count    Frequency (%)
_       604626   100.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    10274863   94.4%
Common   604626     5.6%

Most frequent character per script

Latin
Value              Count     Frequency (%)
E                  3008014   29.3%
R                  1205473   11.7%
S                  1205473   11.7%
P                  1201694   11.7%
N                  608405    5.9%
M                  604626    5.9%
I                  604626    5.9%
V                  604626    5.9%
C                  600847    5.8%
D                  600847    5.8%
Other values (6)   30232     0.3%

Common
Value   Count    Frequency (%)
_       604626   100.0%

Most occurring blocks

Value   Count      Frequency (%)
ASCII   10879489   100.0%

Most frequent character per block

ASCII
Value              Count     Frequency (%)
E                  3008014   27.6%
R                  1205473   11.1%
S                  1205473   11.1%
P                  1201694   11.0%
N                  608405    5.6%
M                  604626    5.6%
I                  604626    5.6%
_                  604626    5.6%
V                  604626    5.6%
C                  600847    5.5%
Other values (7)   631079    5.8%

occurrenceID
Text

Unique 

Distinct: 604626
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 38091438
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique 604626
Unique (%) 100.0%

Sample

1st row http://n2t.net/ark:/65665/3c83a10d1-1e59-4b08-af5b-28d12d2d0c80
2nd row http://n2t.net/ark:/65665/383bb510d-d5ce-4c09-b4c4-bc1482fbaf28
3rd row http://n2t.net/ark:/65665/383f13aa6-a5b6-40bc-bddc-b42c557aebfc
4th row http://n2t.net/ark:/65665/383f4d560-c2d2-485c-906c-b6dad303fd7a
5th row http://n2t.net/ark:/65665/383f634da-bb58-423c-85f4-a267b04c5ee5
ValueCountFrequency (%)
http://n2t.net/ark:/65665/3c83a10d1-1e59-4b08-af5b-28d12d2d0c80 1
 
< 0.1%
http://n2t.net/ark:/65665/3c932a059-56b2-4846-9e97-741d7bdde29c 1
 
< 0.1%
http://n2t.net/ark:/65665/384cb9f0c-76d8-41b2-9a2e-351c10a4ab3f 1
 
< 0.1%
http://n2t.net/ark:/65665/3c94d744a-d127-4564-9b0c-5d349a138dd0 1
 
< 0.1%
http://n2t.net/ark:/65665/384c3715b-7768-468a-b76b-a68ff7a554d0 1
 
< 0.1%
http://n2t.net/ark:/65665/3c8c6462b-a9e9-4efa-9205-6fb4e5ef4e65 1
 
< 0.1%
http://n2t.net/ark:/65665/383f13aa6-a5b6-40bc-bddc-b42c557aebfc 1
 
< 0.1%
http://n2t.net/ark:/65665/383f4d560-c2d2-485c-906c-b6dad303fd7a 1
 
< 0.1%
http://n2t.net/ark:/65665/383f634da-bb58-423c-85f4-a267b04c5ee5 1
 
< 0.1%
http://n2t.net/ark:/65665/3c898aee2-d463-49d7-ad9c-6fd423e170e1 1
 
< 0.1%
Other values (604616) 604616
> 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 3023130
 
7.9%
6 2949272
 
7.7%
- 2418504
 
6.3%
t 2418504
 
6.3%
5 2343156
 
6.2%
a 1889243
 
5.0%
2 1738952
 
4.6%
e 1738278
 
4.6%
3 1737371
 
4.6%
4 1737249
 
4.6%
Other values (16) 16097779
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16478170
43.3%
Lowercase Letter 14357756
37.7%
Other Punctuation 4837008
 
12.7%
Dash Punctuation 2418504
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2418504
16.8%
a 1889243
13.2%
e 1738278
12.1%
b 1285981
9.0%
n 1209252
8.4%
d 1134303
7.9%
c 1132879
7.9%
f 1130812
7.9%
k 604626
 
4.2%
r 604626
 
4.2%
Other values (2) 1209252
8.4%
Decimal Number
ValueCountFrequency (%)
6 2949272
17.9%
5 2343156
14.2%
2 1738952
10.6%
3 1737371
10.5%
4 1737249
10.5%
8 1286190
7.8%
9 1284662
7.8%
0 1134061
 
6.9%
1 1133861
 
6.9%
7 1133396
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 3023130
62.5%
: 1209252
 
25.0%
. 604626
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 2418504
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23733682
62.3%
Latin 14357756
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 3023130
12.7%
6 2949272
12.4%
- 2418504
10.2%
5 2343156
9.9%
2 1738952
7.3%
3 1737371
7.3%
4 1737249
7.3%
8 1286190
 
5.4%
9 1284662
 
5.4%
: 1209252
 
5.1%
Other values (4) 4005944
16.9%
Latin
ValueCountFrequency (%)
t 2418504
16.8%
a 1889243
13.2%
e 1738278
12.1%
b 1285981
9.0%
n 1209252
8.4%
d 1134303
7.9%
c 1132879
7.9%
f 1130812
7.9%
k 604626
 
4.2%
r 604626
 
4.2%
Other values (2) 1209252
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38091438
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 3023130
 
7.9%
6 2949272
 
7.7%
- 2418504
 
6.3%
t 2418504
 
6.3%
5 2343156
 
6.2%
a 1889243
 
5.0%
2 1738952
 
4.6%
e 1738278
 
4.6%
3 1737371
 
4.6%
4 1737249
 
4.6%
Other values (16) 16097779
42.3%

catalogNumber
Text

Missing 

Distinct 371195
Distinct (%) > 99.9%
Missing 233418
Missing (%) 38.6%
Memory size 4.6 MiB

Length

Max length 20
Median length 15
Mean length 15.03873031
Min length 12

Characters and Unicode

Total characters 5582497
Distinct characters 21
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 371182
Unique (%) > 99.9%

Sample

1st row USNMENT00831303
2nd row USNMENT00356408
3rd row USNMENT01436172
4th row USNMENT00357025
5th row USNMENT00314717
ValueCountFrequency (%)
usnment00937212 2
 
< 0.1%
usnment01200936 2
 
< 0.1%
usnment00385731 2
 
< 0.1%
usnment00937219 2
 
< 0.1%
usnment00935890 2
 
< 0.1%
usnment00533165 2
 
< 0.1%
usnment00377587 2
 
< 0.1%
usnment00937222 2
 
< 0.1%
usnment00937214 2
 
< 0.1%
usnment00381323 2
 
< 0.1%
Other values (371185) 371188
> 99.9%
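
The catalogNumber figures can be cross-checked with a little arithmetic. The sketch below assumes that "Distinct" counts the different non-missing values and "Unique" counts the values that occur exactly once; that reading is an assumption, although it agrees with the numbers reported here. The helper `distinct_and_unique` and the toy list are hypothetical and only illustrate the two definitions.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (non-missing count, distinct values, values occurring exactly once)."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    distinct = len(freq)
    unique = sum(1 for c in freq.values() if c == 1)
    return len(present), distinct, unique


# Toy data (hypothetical): A2 and A4 are repeated, one cell is missing
toy = ["A1", "A2", "A2", "A3", None, "A4", "A4"]
n, distinct, unique = distinct_and_unique(toy)

# Consistency check using the figures reported for catalogNumber
total, missing = 604626, 233418
reported_distinct, reported_unique = 371195, 371182
non_missing = total - missing                   # rows that carry a catalog number
repeated = reported_distinct - reported_unique  # values seen more than once
extra_rows = non_missing - reported_distinct    # rows beyond the first copy of each value
print(non_missing, repeated, extra_rows)
```

Under that assumption, 13 catalog numbers are repeated and they account for exactly 13 extra rows, so each would appear twice, in line with the count of 2 shown for the most frequent values.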

Most occurring characters

ValueCountFrequency (%)
0 804605
14.4%
N 741758
13.3%
1 376970
 
6.8%
S 371208
 
6.6%
U 371164
 
6.6%
M 371164
 
6.6%
E 370588
 
6.6%
T 370588
 
6.6%
3 302793
 
5.4%
4 225899
 
4.0%
Other values (11) 1275760
22.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2981860
53.4%
Uppercase Letter 2596558
46.5%
Other Punctuation 4077
 
0.1%
Lowercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 804605
27.0%
1 376970
12.6%
3 302793
 
10.2%
4 225899
 
7.6%
2 225441
 
7.6%
5 215950
 
7.2%
8 215550
 
7.2%
7 210801
 
7.1%
6 202403
 
6.8%
9 201448
 
6.8%
Uppercase Letter
ValueCountFrequency (%)
N 741758
28.6%
S 371208
14.3%
U 371164
14.3%
M 371164
14.3%
E 370588
14.3%
T 370588
14.3%
C 44
 
< 0.1%
A 44
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
b 1
50.0%
a 1
50.0%
Other Punctuation
ValueCountFrequency (%)
. 4077
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2985937
53.5%
Latin 2596560
46.5%

Most frequent character per script

Common
ValueCountFrequency (%)
0 804605
26.9%
1 376970
12.6%
3 302793
 
10.1%
4 225899
 
7.6%
2 225441
 
7.6%
5 215950
 
7.2%
8 215550
 
7.2%
7 210801
 
7.1%
6 202403
 
6.8%
9 201448
 
6.7%
Latin
ValueCountFrequency (%)
N 741758
28.6%
S 371208
14.3%
U 371164
14.3%
M 371164
14.3%
E 370588
14.3%
T 370588
14.3%
C 44
 
< 0.1%
A 44
 
< 0.1%
b 1
 
< 0.1%
a 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5582497
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 804605
14.4%
N 741758
13.3%
1 376970
 
6.8%
S 371208
 
6.6%
U 371164
 
6.6%
M 371164
 
6.6%
E 370588
 
6.6%
T 370588
 
6.6%
3 302793
 
5.4%
4 225899
 
4.0%
Other values (11) 1275760
22.9%

recordNumber
Text

Missing 

Distinct 33
Distinct (%) 89.2%
Missing 604589
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 43
Median length 26
Mean length 17.18918919
Min length 4

Characters and Unicode

Total characters 636
Distinct characters 57
Distinct categories 8
Distinct scripts 2
Distinct blocks 1

Unique

Unique 32
Unique (%) 86.5%

Sample

1st row Collection number "14,957"
2nd row Lot 607, Sub 182
3rd row 4012
4th row Dognin Collection
5th row 12.097
ValueCountFrequency (%)
collection 10
 
10.0%
no 9
 
9.0%
walsingham 7
 
7.0%
dognin 5
 
5.0%
hopkins 3
 
3.0%
quaintance 2
 
2.0%
wlsm 2
 
2.0%
townes 2
 
2.0%
number 2
 
2.0%
from 2
 
2.0%
Other values (56) 56
56.0%

Most occurring characters

ValueCountFrequency (%)
63
 
9.9%
o 52
 
8.2%
n 47
 
7.4%
l 39
 
6.1%
i 33
 
5.2%
. 26
 
4.1%
e 25
 
3.9%
a 22
 
3.5%
t 19
 
3.0%
1 19
 
3.0%
Other values (47) 291
45.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 348
54.7%
Decimal Number 114
 
17.9%
Uppercase Letter 67
 
10.5%
Space Separator 63
 
9.9%
Other Punctuation 40
 
6.3%
Dash Punctuation 2
 
0.3%
Open Punctuation 1
 
0.2%
Close Punctuation 1
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 52
14.9%
n 47
13.5%
l 39
11.2%
i 33
9.5%
e 25
 
7.2%
a 22
 
6.3%
t 19
 
5.5%
c 18
 
5.2%
s 16
 
4.6%
g 14
 
4.0%
Other values (11) 63
18.1%
Uppercase Letter
ValueCountFrequency (%)
C 14
20.9%
W 9
13.4%
N 9
13.4%
H 6
9.0%
D 5
 
7.5%
S 4
 
6.0%
M 3
 
4.5%
Q 2
 
3.0%
T 2
 
3.0%
U 2
 
3.0%
Other values (9) 11
16.4%
Decimal Number
ValueCountFrequency (%)
1 19
16.7%
7 15
13.2%
0 14
12.3%
8 14
12.3%
5 12
10.5%
9 12
10.5%
4 8
7.0%
2 8
7.0%
6 7
 
6.1%
3 5
 
4.4%
Other Punctuation
ValueCountFrequency (%)
. 26
65.0%
" 12
30.0%
, 2
 
5.0%
Space Separator
ValueCountFrequency (%)
63
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 415
65.3%
Common 221
34.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 52
 
12.5%
n 47
 
11.3%
l 39
 
9.4%
i 33
 
8.0%
e 25
 
6.0%
a 22
 
5.3%
t 19
 
4.6%
c 18
 
4.3%
s 16
 
3.9%
C 14
 
3.4%
Other values (30) 130
31.3%
Common
ValueCountFrequency (%)
63
28.5%
. 26
11.8%
1 19
 
8.6%
7 15
 
6.8%
0 14
 
6.3%
8 14
 
6.3%
5 12
 
5.4%
" 12
 
5.4%
9 12
 
5.4%
4 8
 
3.6%
Other values (7) 26
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 636
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
63
 
9.9%
o 52
 
8.2%
n 47
 
7.4%
l 39
 
6.1%
i 33
 
5.2%
. 26
 
4.1%
e 25
 
3.9%
a 22
 
3.5%
t 19
 
3.0%
1 19
 
3.0%
Other values (47) 291
45.8%

recordedBy
Text

Missing 

Distinct 18726
Distinct (%) 4.7%
Missing 203336
Missing (%) 33.6%
Memory size 4.6 MiB

Length

Max length 90
Median length 84
Mean length 11.25684667
Min length 1

Characters and Unicode

Total characters 4517260
Distinct characters 83
Distinct categories 8
Distinct scripts 2
Distinct blocks 2

Unique

Unique 9104
Unique (%) 2.3%

Sample

1st row M. Ortiz B.
2nd row [Not Stated]
3rd row S. Roble
4th row [Not Stated]
5th row C. Flint
ValueCountFrequency (%)
not 65711
 
7.2%
stated 65695
 
7.2%
l 40182
 
4.4%
39875
 
4.4%
j 36886
 
4.0%
macior 31232
 
3.4%
d 28468
 
3.1%
c 27156
 
3.0%
r 25636
 
2.8%
b 22044
 
2.4%
Other values (10691) 530776
58.1%

Most occurring characters

ValueCountFrequency (%)
512371
 
11.3%
. 355530
 
7.9%
t 305132
 
6.8%
a 299337
 
6.6%
e 290066
 
6.4%
o 240179
 
5.3%
r 229270
 
5.1%
i 173763
 
3.8%
n 169850
 
3.8%
l 136863
 
3.0%
Other values (73) 1804899
40.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2587572
57.3%
Uppercase Letter 878220
 
19.4%
Space Separator 512371
 
11.3%
Other Punctuation 405394
 
9.0%
Open Punctuation 65746
 
1.5%
Close Punctuation 65746
 
1.5%
Dash Punctuation 2190
 
< 0.1%
Decimal Number 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 305132
11.8%
a 299337
11.6%
e 290066
11.2%
o 240179
9.3%
r 229270
8.9%
i 173763
 
6.7%
n 169850
 
6.6%
l 136863
 
5.3%
d 115259
 
4.5%
s 95753
 
3.7%
Other values (25) 532100
20.6%
Uppercase Letter
ValueCountFrequency (%)
S 116393
13.3%
M 90618
 
10.3%
N 79756
 
9.1%
B 56903
 
6.5%
C 54336
 
6.2%
L 51912
 
5.9%
D 47327
 
5.4%
J 42554
 
4.8%
W 40148
 
4.6%
G 38218
 
4.4%
Other values (17) 260055
29.6%
Decimal Number
ValueCountFrequency (%)
1 8
38.1%
5 5
23.8%
2 2
 
9.5%
6 2
 
9.5%
0 2
 
9.5%
9 1
 
4.8%
3 1
 
4.8%
Other Punctuation
ValueCountFrequency (%)
. 355530
87.7%
& 39866
 
9.8%
, 9359
 
2.3%
' 622
 
0.2%
? 16
 
< 0.1%
/ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 65735
> 99.9%
( 10
 
< 0.1%
{ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 65735
> 99.9%
) 10
 
< 0.1%
} 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
512371
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2190
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3465792
76.7%
Common 1051468
 
23.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 305132
 
8.8%
a 299337
 
8.6%
e 290066
 
8.4%
o 240179
 
6.9%
r 229270
 
6.6%
i 173763
 
5.0%
n 169850
 
4.9%
l 136863
 
3.9%
S 116393
 
3.4%
d 115259
 
3.3%
Other values (52) 1389680
40.1%
Common
ValueCountFrequency (%)
512371
48.7%
. 355530
33.8%
[ 65735
 
6.3%
] 65735
 
6.3%
& 39866
 
3.8%
, 9359
 
0.9%
- 2190
 
0.2%
' 622
 
0.1%
? 16
 
< 0.1%
( 10
 
< 0.1%
Other values (11) 34
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4516770
> 99.9%
None 490
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
512371
 
11.3%
. 355530
 
7.9%
t 305132
 
6.8%
a 299337
 
6.6%
e 290066
 
6.4%
o 240179
 
5.3%
r 229270
 
5.1%
i 173763
 
3.8%
n 169850
 
3.8%
l 136863
 
3.0%
Other values (63) 1804409
39.9%
None
ValueCountFrequency (%)
ñ 238
48.6%
ü 107
21.8%
á 95
 
19.4%
ä 13
 
2.7%
ö 12
 
2.4%
é 12
 
2.4%
ó 8
 
1.6%
Á 2
 
0.4%
č 2
 
0.4%
â 1
 
0.2%
Distinct 941
Distinct (%) 0.2%
Missing 3136
Missing (%) 0.5%
Memory size 4.6 MiB

Length

Max length 7
Median length 1
Mean length 1.044865251
Min length 1

Characters and Unicode

Total characters 628476
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 393
Unique (%) 0.1%

Sample

1st row 7
2nd row 1
3rd row 1
4th row 1
5th row 1
ValueCountFrequency (%)
1 548219
91.1%
2 10272
 
1.7%
3 6617
 
1.1%
4 4294
 
0.7%
5 2621
 
0.4%
6 2340
 
0.4%
7 1822
 
0.3%
8 1526
 
0.3%
10 1306
 
0.2%
9 1254
 
0.2%
Other values (931) 21219
 
3.5%

Most occurring characters

ValueCountFrequency (%)
1 560802
89.2%
2 17644
 
2.8%
3 11797
 
1.9%
4 8334
 
1.3%
5 6510
 
1.0%
0 6142
 
1.0%
6 5349
 
0.9%
7 4419
 
0.7%
8 3991
 
0.6%
9 3488
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 628476
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 560802
89.2%
2 17644
 
2.8%
3 11797
 
1.9%
4 8334
 
1.3%
5 6510
 
1.0%
0 6142
 
1.0%
6 5349
 
0.9%
7 4419
 
0.7%
8 3991
 
0.6%
9 3488
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Common 628476
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 560802
89.2%
2 17644
 
2.8%
3 11797
 
1.9%
4 8334
 
1.3%
5 6510
 
1.0%
0 6142
 
1.0%
6 5349
 
0.9%
7 4419
 
0.7%
8 3991
 
0.6%
9 3488
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 628476
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 560802
89.2%
2 17644
 
2.8%
3 11797
 
1.9%
4 8334
 
1.3%
5 6510
 
1.0%
0 6142
 
1.0%
6 5349
 
0.9%
7 4419
 
0.7%
8 3991
 
0.6%
9 3488
 
0.6%

sex
Text

Missing 

Distinct 2
Distinct (%) < 0.1%
Missing 384462
Missing (%) 63.6%
Memory size 4.6 MiB

Length

Max length 6
Median length 4
Mean length 4.79924965
Min length 4

Characters and Unicode

Total characters 1056622
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row MALE
2nd row MALE
3rd row MALE
4th row MALE
5th row FEMALE
ValueCountFrequency (%)
male 132181
60.0%
female 87983
40.0%

Most occurring characters

ValueCountFrequency (%)
E 308147
29.2%
M 220164
20.8%
A 220164
20.8%
L 220164
20.8%
F 87983
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1056622
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 308147
29.2%
M 220164
20.8%
A 220164
20.8%
L 220164
20.8%
F 87983
 
8.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1056622
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 308147
29.2%
M 220164
20.8%
A 220164
20.8%
L 220164
20.8%
F 87983
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1056622
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 308147
29.2%
M 220164
20.8%
A 220164
20.8%
L 220164
20.8%
F 87983
 
8.3%

lifeStage
Text

Missing 

Distinct 10
Distinct (%) < 0.1%
Missing 184129
Missing (%) 30.5%
Memory size 4.6 MiB

Length

Max length 10
Median length 5
Mean length 5.02011905
Min length 3

Characters and Unicode

Total characters 2110945
Distinct characters 29
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row Adult
2nd row Adult
3rd row Adult
4th row Adult
5th row Adult
ValueCountFrequency (%)
adult 415182
98.7%
immature 2800
 
0.7%
pupa 946
 
0.2%
larva 886
 
0.2%
unknown 490
 
0.1%
nymph 139
 
< 0.1%
egg 34
 
< 0.1%
deutonymph 17
 
< 0.1%
juvenile 2
 
< 0.1%
subadult 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
u 418949
19.8%
t 418000
19.8%
l 415185
19.7%
d 415183
19.7%
A 415182
19.7%
m 5756
 
0.3%
a 5519
 
0.3%
r 3686
 
0.2%
e 2821
 
0.1%
I 2800
 
0.1%
Other values (19) 7864
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1690448
80.1%
Uppercase Letter 420497
 
19.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 418949
24.8%
t 418000
24.7%
l 415185
24.6%
d 415183
24.6%
m 5756
 
0.3%
a 5519
 
0.3%
r 3686
 
0.2%
e 2821
 
0.2%
n 1489
 
0.1%
p 1102
 
0.1%
Other values (9) 2758
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
A 415182
98.7%
I 2800
 
0.7%
P 946
 
0.2%
L 886
 
0.2%
U 490
 
0.1%
N 139
 
< 0.1%
E 34
 
< 0.1%
D 17
 
< 0.1%
J 2
 
< 0.1%
S 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 2110945
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 418949
19.8%
t 418000
19.8%
l 415185
19.7%
d 415183
19.7%
A 415182
19.7%
m 5756
 
0.3%
a 5519
 
0.3%
r 3686
 
0.2%
e 2821
 
0.1%
I 2800
 
0.1%
Other values (19) 7864
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2110945
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u 418949
19.8%
t 418000
19.8%
l 415185
19.7%
d 415183
19.7%
A 415182
19.7%
m 5756
 
0.3%
a 5519
 
0.3%
r 3686
 
0.2%
e 2821
 
0.1%
I 2800
 
0.1%
Other values (19) 7864
 
0.4%

occurrenceStatus
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.6 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 4232382
Distinct characters 6
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row PRESENT
2nd row PRESENT
3rd row PRESENT
4th row PRESENT
5th row PRESENT
ValueCountFrequency (%)
present 604626
100.0%

Most occurring characters

ValueCountFrequency (%)
E 1209252
28.6%
P 604626
14.3%
R 604626
14.3%
S 604626
14.3%
N 604626
14.3%
T 604626
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4232382
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1209252
28.6%
P 604626
14.3%
R 604626
14.3%
S 604626
14.3%
N 604626
14.3%
T 604626
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 4232382
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1209252
28.6%
P 604626
14.3%
R 604626
14.3%
S 604626
14.3%
N 604626
14.3%
T 604626
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4232382
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1209252
28.6%
P 604626
14.3%
R 604626
14.3%
S 604626
14.3%
N 604626
14.3%
T 604626
14.3%

preparations
Text

Missing 

Distinct 272
Distinct (%) < 0.1%
Missing 42051
Missing (%) 7.0%
Memory size 4.6 MiB

Length

Max length 93
Median length 6
Mean length 6.839850687
Min length 1

Characters and Unicode

Total characters 3847929
Distinct characters 58
Distinct categories 5
Distinct scripts 2
Distinct blocks 1

Unique

Unique 112
Unique (%) < 0.1%

Sample

1st row Pinned
2nd row Pinned
3rd row Pinned
4th row Envelope
5th row Pinned
ValueCountFrequency (%)
pinned 389733
63.9%
envelope 114672
 
18.8%
slide 65056
 
10.7%
vial 9495
 
1.6%
ethanol 6481
 
1.1%
section 3746
 
0.6%
on 3653
 
0.6%
3195
 
0.5%
ink 3151
 
0.5%
pen 3072
 
0.5%
Other values (93) 7800
 
1.3%

Most occurring characters

ValueCountFrequency (%)
n 916431
23.8%
e 701114
18.2%
i 472644
12.3%
d 455886
11.8%
P 366191
 
9.5%
l 199752
 
5.2%
p 142785
 
3.7%
o 133876
 
3.5%
v 114834
 
3.0%
E 112885
 
2.9%
Other values (48) 231531
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3214047
83.5%
Uppercase Letter 553344
 
14.4%
Space Separator 47479
 
1.2%
Other Punctuation 32278
 
0.8%
Decimal Number 781
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 916431
28.5%
e 701114
21.8%
i 472644
14.7%
d 455886
14.2%
l 199752
 
6.2%
p 142785
 
4.4%
o 133876
 
4.2%
v 114834
 
3.6%
a 18594
 
0.6%
s 17524
 
0.5%
Other values (15) 40607
 
1.3%
Uppercase Letter
ValueCountFrequency (%)
P 366191
66.2%
E 112885
 
20.4%
S 56001
 
10.1%
V 9715
 
1.8%
I 3164
 
0.6%
B 2575
 
0.5%
R 887
 
0.2%
M 523
 
0.1%
C 505
 
0.1%
D 388
 
0.1%
Other values (10) 510
 
0.1%
Other Punctuation
ValueCountFrequency (%)
; 28578
88.5%
& 3195
 
9.9%
% 389
 
1.2%
. 69
 
0.2%
, 28
 
0.1%
/ 15
 
< 0.1%
? 4
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
5 389
49.8%
7 389
49.8%
2 1
 
0.1%
3 1
 
0.1%
9 1
 
0.1%
Space Separator
ValueCountFrequency (%)
47479
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3767391
97.9%
Common 80538
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 916431
24.3%
e 701114
18.6%
i 472644
12.5%
d 455886
12.1%
P 366191
 
9.7%
l 199752
 
5.3%
p 142785
 
3.8%
o 133876
 
3.6%
v 114834
 
3.0%
E 112885
 
3.0%
Other values (35) 150993
 
4.0%
Common
ValueCountFrequency (%)
47479
59.0%
; 28578
35.5%
& 3195
 
4.0%
5 389
 
0.5%
% 389
 
0.5%
7 389
 
0.5%
. 69
 
0.1%
, 28
 
< 0.1%
/ 15
 
< 0.1%
? 4
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3847929
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 916431
23.8%
e 701114
18.2%
i 472644
12.3%
d 455886
11.8%
P 366191
 
9.5%
l 199752
 
5.2%
p 142785
 
3.7%
o 133876
 
3.5%
v 114834
 
3.0%
E 112885
 
2.9%
Other values (48) 231531
 
6.0%

occurrenceRemarks
Text

Missing 

Distinct 31232
Distinct (%) 21.5%
Missing 459276
Missing (%) 76.0%
Memory size 4.6 MiB

Length

Max length 367359
Median length 152440
Mean length 80.11453732
Min length 1

Characters and Unicode

Total characters 11644648
Distinct characters 126
Distinct categories 17
Distinct scripts 2
Distinct blocks 5

Unique

Unique 27501
Unique (%) 18.9%

Sample

1st row One leg removed for genetic sampling while on loan to GUELPH.
2nd row Lindroth, 1975:125: (the loc. is no doubt wrong).
3rd row F. Monros Coll. 1959 G.M. Greene Coll. C. Schaeffer Coll. Shoemaker Coll. 1956 A. Nicolay Coll. 1950 L.W. Saylor Coll.
4th row Specimen data is incomplete. Phase 1 of data capture inlcluded USNMENT#s and general locality.
5th row One leg removed for genetic sampling while on loan to GUELPH.
ValueCountFrequency (%)
digitization 56218
 
3.3%
by 48162
 
2.8%
digital 44075
 
2.6%
volunteers 44039
 
2.6%
transcribed 44039
 
2.6%
of 43241
 
2.6%
on 41034
 
2.4%
to 36796
 
2.2%
loan 36495
 
2.2%
for 36258
 
2.1%
Other values (49844) 1263433
74.6%

Most occurring characters

ValueCountFrequency (%)
1504787
 
12.9%
e 838841
 
7.2%
i 811548
 
7.0%
a 687048
 
5.9%
t 675294
 
5.8%
o 659287
 
5.7%
n 620298
 
5.3%
r 558541
 
4.8%
s 454981
 
3.9%
l 435458
 
3.7%
Other values (116) 4398565
37.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8072373
69.3%
Space Separator 1504787
 
12.9%
Uppercase Letter 1147070
 
9.9%
Decimal Number 347393
 
3.0%
Other Punctuation 306192
 
2.6%
Control 139457
 
1.2%
Open Punctuation 40059
 
0.3%
Close Punctuation 40034
 
0.3%
Dash Punctuation 25457
 
0.2%
Math Symbol 11915
 
0.1%
Other values (7) 9911
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 838841
10.4%
i 811548
10.1%
a 687048
 
8.5%
t 675294
 
8.4%
o 659287
 
8.2%
n 620298
 
7.7%
r 558541
 
6.9%
s 454981
 
5.6%
l 435458
 
5.4%
d 321203
 
4.0%
Other values (30) 2009874
24.9%
Uppercase Letter
ValueCountFrequency (%)
P 119799
 
10.4%
S 109393
 
9.5%
O 106766
 
9.3%
E 97854
 
8.5%
D 76330
 
6.7%
I 72682
 
6.3%
T 72597
 
6.3%
M 68130
 
5.9%
U 60048
 
5.2%
L 53055
 
4.6%
Other values (21) 310416
27.1%
Other Punctuation
ValueCountFrequency (%)
. 176603
57.7%
; 48088
 
15.7%
, 34078
 
11.1%
: 20222
 
6.6%
# 9313
 
3.0%
/ 6742
 
2.2%
' 5289
 
1.7%
" 3442
 
1.1%
& 1718
 
0.6%
? 602
 
0.2%
Other values (7) 95
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 70694
20.3%
9 43824
12.6%
2 42022
12.1%
0 39802
11.5%
3 27396
 
7.9%
4 27182
 
7.8%
5 26126
 
7.5%
6 25057
 
7.2%
8 23473
 
6.8%
7 21817
 
6.3%
Math Symbol
ValueCountFrequency (%)
| 10552
88.6%
+ 720
 
6.0%
= 620
 
5.2%
> 11
 
0.1%
~ 8
 
0.1%
< 4
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
14
53.8%
° 7
26.9%
4
 
15.4%
© 1
 
3.8%
Open Punctuation
ValueCountFrequency (%)
( 33853
84.5%
[ 6195
 
15.5%
{ 11
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 33841
84.5%
] 6182
 
15.4%
} 11
 
< 0.1%
Control
ValueCountFrequency (%)
138832
99.6%
625
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
- 25456
> 99.9%
1
 
< 0.1%
Currency Symbol
ValueCountFrequency (%)
$ 1
50.0%
£ 1
50.0%
Space Separator
ValueCountFrequency (%)
1504787
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 9827
100.0%
Initial Punctuation
ValueCountFrequency (%)
23
100.0%
Final Punctuation
ValueCountFrequency (%)
23
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 9
100.0%
Modifier Letter
ValueCountFrequency (%)
ʼ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9219441
79.2%
Common 2425207
 
20.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 838841
 
9.1%
i 811548
 
8.8%
a 687048
 
7.5%
t 675294
 
7.3%
o 659287
 
7.2%
n 620298
 
6.7%
r 558541
 
6.1%
s 454981
 
4.9%
l 435458
 
4.7%
d 321203
 
3.5%
Other values (60) 3156942
34.2%
Common
ValueCountFrequency (%)
1504787
62.0%
. 176603
 
7.3%
138832
 
5.7%
1 70694
 
2.9%
; 48088
 
2.0%
9 43824
 
1.8%
2 42022
 
1.7%
0 39802
 
1.6%
, 34078
 
1.4%
( 33853
 
1.4%
Other values (46) 292624
 
12.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11644446
> 99.9%
None 134
 
< 0.1%
Punctuation 49
 
< 0.1%
Misc Symbols 18
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1504787
 
12.9%
e 838841
 
7.2%
i 811548
 
7.0%
a 687048
 
5.9%
t 675294
 
5.8%
o 659287
 
5.7%
n 620298
 
5.3%
r 558541
 
4.8%
s 454981
 
3.9%
l 435458
 
3.7%
Other values (85) 4398363
37.8%
None
ValueCountFrequency (%)
é 32
23.9%
á 22
16.4%
ü 20
14.9%
í 9
 
6.7%
ó 8
 
6.0%
· 7
 
5.2%
° 7
 
5.2%
ö 5
 
3.7%
ø 3
 
2.2%
É 2
 
1.5%
Other values (14) 19
14.2%
Punctuation
ValueCountFrequency (%)
23
46.9%
23
46.9%
2
 
4.1%
1
 
2.0%
Misc Symbols
ValueCountFrequency (%)
14
77.8%
4
 
22.2%
Modifier Letters
ValueCountFrequency (%)
ʼ 1
100.0%

verbatimLabel
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 604625
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 6
Distinct categories 3
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row -11.7815
ValueCountFrequency (%)
11.7815 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 3
37.5%
- 1
 
12.5%
. 1
 
12.5%
7 1
 
12.5%
8 1
 
12.5%
5 1
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
75.0%
Dash Punctuation 1
 
12.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3
50.0%
7 1
 
16.7%
8 1
 
16.7%
5 1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3
37.5%
- 1
 
12.5%
. 1
 
12.5%
7 1
 
12.5%
8 1
 
12.5%
5 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3
37.5%
- 1
 
12.5%
. 1
 
12.5%
7 1
 
12.5%
8 1
 
12.5%
5 1
 
12.5%

materialSampleID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 604625
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 6
Distinct categories 3
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row -76.7017
ValueCountFrequency (%)
76.7017 1
100.0%

Most occurring characters

ValueCountFrequency (%)
7 3
37.5%
- 1
 
12.5%
6 1
 
12.5%
. 1
 
12.5%
0 1
 
12.5%
1 1
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
75.0%
Dash Punctuation 1
 
12.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 3
50.0%
6 1
 
16.7%
0 1
 
16.7%
1 1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 3
37.5%
- 1
 
12.5%
6 1
 
12.5%
. 1
 
12.5%
0 1
 
12.5%
1 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 3
37.5%
- 1
 
12.5%
6 1
 
12.5%
. 1
 
12.5%
0 1
 
12.5%
1 1
 
12.5%

fieldNumber
Text

Missing 

Distinct 3091
Distinct (%) 72.7%
Missing 600377
Missing (%) 99.3%
Memory size 4.6 MiB

Length

Max length 32
Median length 29
Mean length 9.591433278
Min length 1

Characters and Unicode

Total characters 40754
Distinct characters 70
Distinct categories 8
Distinct scripts 2
Distinct blocks 1

Unique

Unique 2646
Unique (%) 62.3%

Sample

1st row BBB991
2nd row BBB642-DERM
3rd row 1653
4th row JSL021109-18
5th row COL-8-101
ValueCountFrequency (%)
1653 128
 
2.8%
2 46
 
1.0%
bbb899-hym 34
 
0.7%
1 32
 
0.7%
bbb791-hym 25
 
0.5%
bbb749-hym 23
 
0.5%
759-8 22
 
0.5%
tub 20
 
0.4%
tank 18
 
0.4%
9 18
 
0.4%
Other values (3087) 4225
92.0%

Most occurring characters

ValueCountFrequency (%)
B 4781
 
11.7%
0 3995
 
9.8%
- 3976
 
9.8%
1 3398
 
8.3%
2 2238
 
5.5%
3 1558
 
3.8%
6 1541
 
3.8%
7 1509
 
3.7%
4 1498
 
3.7%
9 1481
 
3.6%
Other values (60) 14779
36.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 19488 | 47.8%
Uppercase Letter | 15048 | 36.9%
Dash Punctuation | 3976 | 9.8%
Lowercase Letter | 1242 | 3.0%
Other Punctuation | 654 | 1.6%
Space Separator | 342 | 0.8%
Open Punctuation | 2 | < 0.1%
Close Punctuation | 2 | < 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 4781
31.8%
S 1388
 
9.2%
T 1136
 
7.5%
C 792
 
5.3%
M 763
 
5.1%
A 707
 
4.7%
L 667
 
4.4%
R 639
 
4.2%
N 583
 
3.9%
H 532
 
3.5%
Other values (15) 3060
20.3%
Lowercase Letter
ValueCountFrequency (%)
e 146
11.8%
a 138
11.1%
o 134
10.8%
t 118
 
9.5%
b 82
 
6.6%
n 81
 
6.5%
r 67
 
5.4%
m 57
 
4.6%
c 57
 
4.6%
i 55
 
4.4%
Other values (13) 307
24.7%
Decimal Number
ValueCountFrequency (%)
0 3995
20.5%
1 3398
17.4%
2 2238
11.5%
3 1558
 
8.0%
6 1541
 
7.9%
7 1509
 
7.7%
4 1498
 
7.7%
9 1481
 
7.6%
5 1174
 
6.0%
8 1096
 
5.6%
Other Punctuation
ValueCountFrequency (%)
# 344
52.6%
. 199
30.4%
; 93
 
14.2%
, 10
 
1.5%
' 3
 
0.5%
" 3
 
0.5%
/ 1
 
0.2%
: 1
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 3976
100.0%
Space Separator
ValueCountFrequency (%)
342
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 24464
60.0%
Latin 16290
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 4781
29.3%
S 1388
 
8.5%
T 1136
 
7.0%
C 792
 
4.9%
M 763
 
4.7%
A 707
 
4.3%
L 667
 
4.1%
R 639
 
3.9%
N 583
 
3.6%
H 532
 
3.3%
Other values (38) 4302
26.4%
Common
ValueCountFrequency (%)
0 3995
16.3%
- 3976
16.3%
1 3398
13.9%
2 2238
9.1%
3 1558
 
6.4%
6 1541
 
6.3%
7 1509
 
6.2%
4 1498
 
6.1%
9 1481
 
6.1%
5 1174
 
4.8%
Other values (12) 2096
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 40754
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 4781
 
11.7%
0 3995
 
9.8%
- 3976
 
9.8%
1 3398
 
8.3%
2 2238
 
5.5%
3 1558
 
3.8%
6 1541
 
3.8%
7 1509
 
3.7%
4 1498
 
3.7%
9 1481
 
3.6%
Other values (60) 14779
36.3%
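
The summary figures for fieldNumber fit together in a simple way: Missing (%) is taken over all 604626 rows, while Distinct (%) and Unique (%) appear to be taken over the 4249 non-missing values (3091 / 4249 ≈ 72.7%). The sketch below works on toy data under that reading; `summarize` is a hypothetical helper, not the profiling tool's API.

```python
from collections import Counter
from typing import Optional, Sequence


def summarize(values: Sequence[Optional[str]]) -> dict:
    """Compute Distinct / Unique / Missing figures for a text column.

    `None` marks a missing cell. Distinct and unique percentages are taken
    over non-missing values; the missing percentage over all rows.
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)
    n_rows, n_present = len(values), len(present)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)  # seen exactly once
    return {
        "distinct": distinct,
        "distinct_pct": 100 * distinct / n_present if n_present else 0.0,
        "unique": unique,
        "unique_pct": 100 * unique / n_present if n_present else 0.0,
        "missing": n_rows - n_present,
        "missing_pct": 100 * (n_rows - n_present) / n_rows if n_rows else 0.0,
    }


# Toy column: three missing cells, "1653" repeated, two values seen once.
stats = summarize(["1653", "1653", "BBB991", None, "COL-8-101", None, None, "1653"])
print(stats)

# Cross-check against the reported fieldNumber figures.
rows, missing, distinct = 604626, 600377, 3091
print(round(100 * missing / rows, 1), round(100 * distinct / (rows - missing), 1))
```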

eventDate
Text

Missing 

Distinct: 45561
Distinct (%): 12.5%
Missing: 239769
Missing (%): 39.7%
Memory size: 4.6 MiB

Length

Max length: 21
Median length: 10
Mean length: 10.99102388
Min length: 4

Characters and Unicode

Total characters: 4010152
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 12880
Unique (%): 3.5%

Sample

1st row: 1967-06-20
2nd row: 1914-07
3rd row: 2005-08-02
4th row: 1964-04-25
5th row: 1971-08-22

Value | Count | Frequency (%)
1998-07-26 | 709 | 0.2%
1938 | 599 | 0.2%
1896 | 545 | 0.1%
2006-06-24 | 544 | 0.1%
1933 | 543 | 0.1%
1960-06-30 | 506 | 0.1%
1930 | 495 | 0.1%
1936 | 490 | 0.1%
1927-07-10 | 469 | 0.1%
1964-08-01/1964-08-31 | 449 | 0.1%
Other values (45551) | 359508 | 98.5%

Most occurring characters

ValueCountFrequency (%)
- 776958
19.4%
1 696249
17.4%
0 648003
16.2%
9 488959
12.2%
2 286183
 
7.1%
6 224282
 
5.6%
7 215676
 
5.4%
8 182043
 
4.5%
5 158527
 
4.0%
3 154598
 
3.9%
Other values (2) 178674
 
4.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3189224 | 79.5%
Dash Punctuation | 776958 | 19.4%
Other Punctuation | 43970 | 1.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 696249
21.8%
0 648003
20.3%
9 488959
15.3%
2 286183
9.0%
6 224282
 
7.0%
7 215676
 
6.8%
8 182043
 
5.7%
5 158527
 
5.0%
3 154598
 
4.8%
4 134704
 
4.2%
Dash Punctuation
ValueCountFrequency (%)
- 776958
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 43970
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4010152
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 776958
19.4%
1 696249
17.4%
0 648003
16.2%
9 488959
12.2%
2 286183
 
7.1%
6 224282
 
5.6%
7 215676
 
5.4%
8 182043
 
4.5%
5 158527
 
4.0%
3 154598
 
3.9%
Other values (2) 178674
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4010152
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 776958
19.4%
1 696249
17.4%
0 648003
16.2%
9 488959
12.2%
2 286183
 
7.1%
6 224282
 
5.6%
7 215676
 
5.4%
8 182043
 
4.5%
5 158527
 
4.0%
3 154598
 
3.9%
Other values (2) 178674
 
4.5%
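
The eventDate values mix full dates (1967-06-20), year-month values (1914-07), bare years (1938) and slash-separated intervals (1964-08-01/1964-08-31), and the column uses only digits, "-" and "/". The sketch below shows one way to normalise all four shapes into an inclusive date range. It assumes no time-of-day component is present; `parse_event_date` and `_bounds` are hypothetical helper names.

```python
import calendar
from datetime import date
from typing import Tuple


def _bounds(part: str) -> Tuple[date, date]:
    """Return the first and last day covered by YYYY, YYYY-MM or YYYY-MM-DD."""
    pieces = [int(p) for p in part.split("-")]
    if len(pieces) == 1:
        (year,) = pieces
        return date(year, 1, 1), date(year, 12, 31)
    if len(pieces) == 2:
        year, month = pieces
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    year, month, day = pieces
    return date(year, month, day), date(year, month, day)


def parse_event_date(value: str) -> Tuple[date, date]:
    """Parse an eventDate string into an inclusive (start, end) pair."""
    start_text, _, end_text = value.partition("/")
    start, _ = _bounds(start_text)
    # Without a "/" the end of the range comes from the same partial date.
    _, end = _bounds(end_text or start_text)
    return start, end


for raw in ["1967-06-20", "1914-07", "1938", "1964-08-01/1964-08-31"]:
    print(raw, "->", parse_event_date(raw))
```

Keeping both ends of the range avoids inventing a day or month for records that only state a year.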

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 270965
Missing (%): 44.8%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.85075271
Min length: 1

Characters and Unicode

Total characters: 951185
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 171
2nd row: 214
3rd row: 116
4th row: 234
5th row: 157

Value | Count | Frequency (%)
182 | 3298 | 1.0%
183 | 2901 | 0.9%
191 | 2876 | 0.9%
207 | 2734 | 0.8%
213 | 2713 | 0.8%
178 | 2623 | 0.8%
214 | 2602 | 0.8%
172 | 2574 | 0.8%
189 | 2556 | 0.8%
218 | 2541 | 0.8%
Other values (356) | 306243 | 91.8%

Most occurring characters

ValueCountFrequency (%)
1 221414
23.3%
2 186945
19.7%
3 89297
9.4%
9 66624
 
7.0%
8 65748
 
6.9%
0 65602
 
6.9%
6 64712
 
6.8%
7 63778
 
6.7%
5 63690
 
6.7%
4 63375
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 951185
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 221414
23.3%
2 186945
19.7%
3 89297
9.4%
9 66624
 
7.0%
8 65748
 
6.9%
0 65602
 
6.9%
6 64712
 
6.8%
7 63778
 
6.7%
5 63690
 
6.7%
4 63375
 
6.7%

Most occurring scripts

ValueCountFrequency (%)
Common 951185
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 221414
23.3%
2 186945
19.7%
3 89297
9.4%
9 66624
 
7.0%
8 65748
 
6.9%
0 65602
 
6.9%
6 64712
 
6.8%
7 63778
 
6.7%
5 63690
 
6.7%
4 63375
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 951185
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 221414
23.3%
2 186945
19.7%
3 89297
9.4%
9 66624
 
7.0%
8 65748
 
6.9%
0 65602
 
6.9%
6 64712
 
6.8%
7 63778
 
6.7%
5 63690
 
6.7%
4 63375
 
6.7%
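
The startDayOfYear samples (171, 214, 116, 234) appear consistent with the full dates shown in the eventDate samples (1967-06-20, 2005-08-02, 1964-04-25, 1971-08-22). The 366 distinct values reflect leap years. A minimal sketch of that conversion using only the standard library; `day_of_year` is an illustrative name.

```python
from datetime import date


def day_of_year(iso_date: str) -> int:
    """Return the ordinal day of the year (1-366) for a YYYY-MM-DD string."""
    return date.fromisoformat(iso_date).timetuple().tm_yday


samples = ["1967-06-20", "2005-08-02", "1964-04-25", "1971-08-22"]
for iso in samples:
    print(iso, day_of_year(iso))
# 1967-06-20 171
# 2005-08-02 214
# 1964-04-25 116   (1964 is a leap year)
# 1971-08-22 234
```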

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 270965
Missing (%): 44.8%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.860076545
Min length: 1

Characters and Unicode

Total characters: 954296
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 171
2nd row: 214
3rd row: 116
4th row: 234
5th row: 157

Value | Count | Frequency (%)
207 | 2989 | 0.9%
191 | 2948 | 0.9%
197 | 2758 | 0.8%
212 | 2710 | 0.8%
182 | 2684 | 0.8%
178 | 2598 | 0.8%
181 | 2581 | 0.8%
196 | 2566 | 0.8%
172 | 2491 | 0.7%
208 | 2488 | 0.7%
Other values (356) | 306848 | 92.0%

Most occurring characters

ValueCountFrequency (%)
1 220893
23.1%
2 187359
19.6%
3 90241
9.5%
9 67299
 
7.1%
0 66539
 
7.0%
7 65376
 
6.9%
8 65012
 
6.8%
6 64562
 
6.8%
5 63697
 
6.7%
4 63318
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 954296
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 220893
23.1%
2 187359
19.6%
3 90241
9.5%
9 67299
 
7.1%
0 66539
 
7.0%
7 65376
 
6.9%
8 65012
 
6.8%
6 64562
 
6.8%
5 63697
 
6.7%
4 63318
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 954296
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 220893
23.1%
2 187359
19.6%
3 90241
9.5%
9 67299
 
7.1%
0 66539
 
7.0%
7 65376
 
6.9%
8 65012
 
6.8%
6 64562
 
6.8%
5 63697
 
6.7%
4 63318
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 954296
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 220893
23.1%
2 187359
19.6%
3 90241
9.5%
9 67299
 
7.1%
0 66539
 
7.0%
7 65376
 
6.9%
8 65012
 
6.8%
6 64562
 
6.8%
5 63697
 
6.7%
4 63318
 
6.6%

year
Text

Missing 

Distinct: 190
Distinct (%): 0.1%
Missing: 240229
Missing (%): 39.7%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1457588
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 17
Unique (%): < 0.1%

Sample

1st row: 1967
2nd row: 1914
3rd row: 2005
4th row: 1964
5th row: 1971

Value | Count | Frequency (%)
1966 | 12303 | 3.4%
1968 | 9189 | 2.5%
1971 | 8968 | 2.5%
1967 | 8355 | 2.3%
1965 | 7870 | 2.2%
1972 | 6272 | 1.7%
1964 | 6145 | 1.7%
1974 | 6095 | 1.7%
1973 | 6077 | 1.7%
1963 | 5552 | 1.5%
Other values (180) | 287571 | 78.9%

Most occurring characters

ValueCountFrequency (%)
1 397082
27.2%
9 381004
26.1%
6 108661
 
7.5%
0 107799
 
7.4%
2 92602
 
6.4%
7 89152
 
6.1%
8 74474
 
5.1%
5 72350
 
5.0%
3 69496
 
4.8%
4 64968
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1457588
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 397082
27.2%
9 381004
26.1%
6 108661
 
7.5%
0 107799
 
7.4%
2 92602
 
6.4%
7 89152
 
6.1%
8 74474
 
5.1%
5 72350
 
5.0%
3 69496
 
4.8%
4 64968
 
4.5%

Most occurring scripts

ValueCountFrequency (%)
Common 1457588
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 397082
27.2%
9 381004
26.1%
6 108661
 
7.5%
0 107799
 
7.4%
2 92602
 
6.4%
7 89152
 
6.1%
8 74474
 
5.1%
5 72350
 
5.0%
3 69496
 
4.8%
4 64968
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1457588
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 397082
27.2%
9 381004
26.1%
6 108661
 
7.5%
0 107799
 
7.4%
2 92602
 
6.4%
7 89152
 
6.1%
8 74474
 
5.1%
5 72350
 
5.0%
3 69496
 
4.8%
4 64968
 
4.5%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 254573
Missing (%): 42.1%
Memory size: 4.6 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.112948611
Min length: 1

Characters and Unicode

Total characters: 389591
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 6
2nd row: 7
3rd row: 8
4th row: 4
5th row: 8

Value | Count | Frequency (%)
7 | 73156 | 20.9%
6 | 58086 | 16.6%
8 | 51402 | 14.7%
5 | 35620 | 10.2%
9 | 25573 | 7.3%
4 | 24539 | 7.0%
3 | 16420 | 4.7%
10 | 16139 | 4.6%
2 | 13949 | 4.0%
11 | 13286 | 3.8%
Other values (2) | 21883 | 6.3%

Most occurring characters

ValueCountFrequency (%)
7 73156
18.8%
1 64594
16.6%
6 58086
14.9%
8 51402
13.2%
5 35620
9.1%
9 25573
 
6.6%
4 24539
 
6.3%
2 24062
 
6.2%
3 16420
 
4.2%
0 16139
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 389591
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 73156
18.8%
1 64594
16.6%
6 58086
14.9%
8 51402
13.2%
5 35620
9.1%
9 25573
 
6.6%
4 24539
 
6.3%
2 24062
 
6.2%
3 16420
 
4.2%
0 16139
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Common 389591
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 73156
18.8%
1 64594
16.6%
6 58086
14.9%
8 51402
13.2%
5 35620
9.1%
9 25573
 
6.6%
4 24539
 
6.3%
2 24062
 
6.2%
3 16420
 
4.2%
0 16139
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 389591
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 73156
18.8%
1 64594
16.6%
6 58086
14.9%
8 51402
13.2%
5 35620
9.1%
9 25573
 
6.6%
4 24539
 
6.3%
2 24062
 
6.2%
3 16420
 
4.2%
0 16139
 
4.1%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 314935
Missing (%): 52.1%
Memory size: 4.6 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.709435226
Min length: 1

Characters and Unicode

Total characters: 495208
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 2
3rd row: 25
4th row: 22
5th row: 6

Value | Count | Frequency (%)
8 | 11096 | 3.8%
20 | 10742 | 3.7%
10 | 10614 | 3.7%
1 | 10586 | 3.7%
12 | 10579 | 3.7%
15 | 10517 | 3.6%
26 | 9863 | 3.4%
25 | 9824 | 3.4%
16 | 9809 | 3.4%
14 | 9721 | 3.4%
Other values (21) | 186340 | 64.3%

Most occurring characters

ValueCountFrequency (%)
1 132241
26.7%
2 123511
24.9%
3 40350
 
8.1%
0 29762
 
6.0%
8 29425
 
5.9%
6 29321
 
5.9%
5 28696
 
5.8%
4 28235
 
5.7%
7 27405
 
5.5%
9 26262
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 495208
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 132241
26.7%
2 123511
24.9%
3 40350
 
8.1%
0 29762
 
6.0%
8 29425
 
5.9%
6 29321
 
5.9%
5 28696
 
5.8%
4 28235
 
5.7%
7 27405
 
5.5%
9 26262
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
Common 495208
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 132241
26.7%
2 123511
24.9%
3 40350
 
8.1%
0 29762
 
6.0%
8 29425
 
5.9%
6 29321
 
5.9%
5 28696
 
5.8%
4 28235
 
5.7%
7 27405
 
5.5%
9 26262
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 495208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 132241
26.7%
2 123511
24.9%
3 40350
 
8.1%
0 29762
 
6.0%
8 29425
 
5.9%
6 29321
 
5.9%
5 28696
 
5.8%
4 28235
 
5.7%
7 27405
 
5.5%
9 26262
 
5.3%

verbatimEventDate
Text

Missing 

Distinct: 67985
Distinct (%): 32.6%
Missing: 396306
Missing (%): 65.5%
Memory size: 4.6 MiB

Length

Max length: 79
Median length: 71
Mean length: 10.59670219
Min length: 1

Characters and Unicode

Total characters: 2207505
Distinct characters: 92
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 51573
Unique (%): 24.8%

Sample

1st row: [Not Stated]
2nd row: 2-Aug-2005
3rd row: [Not Stated]
4th row: [Not Stated]
5th row: 9-IX-78

Value | Count | Frequency (%)
not | 32197 | 8.2%
stated | 32165 | 8.2%
july | 8706 | 2.2%
aug | 7740 | 2.0%
june | 7233 | 1.8%
may | 5957 | 1.5%
1968 | 5763 | 1.5%
1971 | 5705 | 1.5%
1966 | 4507 | 1.1%
1972 | 2977 | 0.8%
Other values (37313) | 279737 | 71.2%

Most occurring characters

ValueCountFrequency (%)
1 217306
 
9.8%
184367
 
8.4%
9 146678
 
6.6%
- 127695
 
5.8%
2 112927
 
5.1%
t 105528
 
4.8%
I 88868
 
4.0%
6 79315
 
3.6%
0 76302
 
3.5%
. 64856
 
2.9%
Other values (82) 1003663
45.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 900761 | 40.8%
Lowercase Letter | 464809 | 21.1%
Uppercase Letter | 333358 | 15.1%
Space Separator | 184367 | 8.4%
Other Punctuation | 128788 | 5.8%
Dash Punctuation | 127731 | 5.8%
Open Punctuation | 33629 | 1.5%
Close Punctuation | 33624 | 1.5%
Connector Punctuation | 250 | < 0.1%
Math Symbol | 187 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 105528
22.7%
e 57947
12.5%
a 49161
10.6%
u 41265
 
8.9%
o 39662
 
8.5%
d 33264
 
7.2%
n 19822
 
4.3%
y 17900
 
3.9%
l 17062
 
3.7%
r 16875
 
3.6%
Other values (18) 66323
14.3%
Uppercase Letter
ValueCountFrequency (%)
I 88868
26.7%
V 43505
13.1%
N 38307
11.5%
S 36913
11.1%
J 33535
 
10.1%
A 23441
 
7.0%
M 13900
 
4.2%
X 9129
 
2.7%
U 7428
 
2.2%
E 5306
 
1.6%
Other values (17) 33026
 
9.9%
Other Punctuation
ValueCountFrequency (%)
. 64856
50.4%
, 34975
27.2%
/ 22999
 
17.9%
' 5024
 
3.9%
: 620
 
0.5%
? 141
 
0.1%
; 102
 
0.1%
& 38
 
< 0.1%
" 21
 
< 0.1%
# 6
 
< 0.1%
Other values (3) 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 217306
24.1%
9 146678
16.3%
2 112927
12.5%
6 79315
 
8.8%
0 76302
 
8.5%
7 63754
 
7.1%
3 54206
 
6.0%
8 53731
 
6.0%
5 48632
 
5.4%
4 47910
 
5.3%
Open Punctuation
ValueCountFrequency (%)
[ 33541
99.7%
( 82
 
0.2%
{ 6
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 33536
99.7%
) 82
 
0.2%
} 6
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
| 156
83.4%
+ 26
 
13.9%
= 5
 
2.7%
Dash Punctuation
ValueCountFrequency (%)
- 127695
> 99.9%
36
 
< 0.1%
Space Separator
ValueCountFrequency (%)
184367
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 250
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1409338
63.8%
Latin 798167
36.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 105528
13.2%
I 88868
 
11.1%
e 57947
 
7.3%
a 49161
 
6.2%
V 43505
 
5.5%
u 41265
 
5.2%
o 39662
 
5.0%
N 38307
 
4.8%
S 36913
 
4.6%
J 33535
 
4.2%
Other values (45) 263476
33.0%
Common
ValueCountFrequency (%)
1 217306
15.4%
184367
13.1%
9 146678
10.4%
- 127695
9.1%
2 112927
8.0%
6 79315
 
5.6%
0 76302
 
5.4%
. 64856
 
4.6%
7 63754
 
4.5%
3 54206
 
3.8%
Other values (27) 281932
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2207464
> 99.9%
Punctuation 37
 
< 0.1%
None 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 217306
 
9.8%
184367
 
8.4%
9 146678
 
6.6%
- 127695
 
5.8%
2 112927
 
5.1%
t 105528
 
4.8%
I 88868
 
4.0%
6 79315
 
3.6%
0 76302
 
3.5%
. 64856
 
2.9%
Other values (77) 1003622
45.5%
Punctuation
ValueCountFrequency (%)
36
97.3%
1
 
2.7%
None
ValueCountFrequency (%)
û 2
50.0%
Ç 1
25.0%
ÿ 1
25.0%
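
The verbatimEventDate samples include label-style dates such as "2-Aug-2005" and "9-IX-78", where the month is written as an abbreviation or a Roman numeral; the many I, V and X characters in the uppercase table are at least compatible with Roman-numeral months being common, though the report does not say so. The sketch below handles only that day-month-year shape and returns `None` for anything else, including the "[Not Stated]" placeholder. All names are hypothetical, and the year is kept as text because the century of a two-digit year is not stated in the data.

```python
import re
from typing import Optional, Tuple

ROMAN_MONTHS = {
    "I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6,
    "VII": 7, "VIII": 8, "IX": 9, "X": 10, "XI": 11, "XII": 12,
}
MONTH_ABBR = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

# day - month (letters) - year, e.g. "2-Aug-2005" or "9-IX-78"
_PATTERN = re.compile(r"^(\d{1,2})-([A-Za-z]+)-(\d{2,4})$")


def parse_verbatim(value: str) -> Optional[Tuple[int, int, str]]:
    """Return (day, month, year_text), or None if the shape is not recognised."""
    match = _PATTERN.match(value.strip())
    if match is None:
        return None
    day_text, month_text, year_text = match.groups()
    # Try a Roman numeral first, then a three-letter English abbreviation.
    month = ROMAN_MONTHS.get(month_text.upper())
    if month is None:
        month = MONTH_ABBR.get(month_text[:3].lower())
    if month is None:
        return None
    return int(day_text), month, year_text


for raw in ["2-Aug-2005", "9-IX-78", "[Not Stated]"]:
    print(raw, "->", parse_verbatim(raw))
```

With 67985 distinct values in this column, a real cleaning pass would need many more patterns than this one.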

habitat
Text

Missing 

Distinct: 89
Distinct (%): 44.7%
Missing: 604427
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 103
Median length: 43
Mean length: 19.28643216
Min length: 5

Characters and Unicode

Total characters: 3838
Distinct characters: 62
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 64
Unique (%): 32.2%

Sample

1st row: Roadside in coniferous forest
2nd row: On a figleaf gourd
3rd row: cultivated garden
4th row: hammocks-dense hardwood & Palmetto forests
5th row: visiting mango flowers

Value | Count | Frequency (%)
garden | 45 | 7.4%
cultivated | 44 | 7.3%
stream | 26 | 4.3%
on | 26 | 4.3%
forest | 23 | 3.8%
in | 19 | 3.1%
of | 13 | 2.1%
collected | 12 | 2.0%
at | 9 | 1.5%
terre | 8 | 1.3%
Other values (183) | 381 | 62.9%

Most occurring characters

ValueCountFrequency (%)
407
 
10.6%
e 388
 
10.1%
a 308
 
8.0%
r 258
 
6.7%
t 250
 
6.5%
d 224
 
5.8%
n 223
 
5.8%
o 217
 
5.7%
i 190
 
5.0%
l 185
 
4.8%
Other values (52) 1188
31.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3215 | 83.8%
Space Separator | 407 | 10.6%
Uppercase Letter | 126 | 3.3%
Other Punctuation | 51 | 1.3%
Decimal Number | 27 | 0.7%
Dash Punctuation | 6 | 0.2%
Close Punctuation | 3 | 0.1%
Open Punctuation | 3 | 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 388
12.1%
a 308
 
9.6%
r 258
 
8.0%
t 250
 
7.8%
d 224
 
7.0%
n 223
 
6.9%
o 217
 
6.7%
i 190
 
5.9%
l 185
 
5.8%
s 175
 
5.4%
Other values (15) 797
24.8%
Uppercase Letter
ValueCountFrequency (%)
S 28
22.2%
C 24
19.0%
R 9
 
7.1%
O 9
 
7.1%
P 8
 
6.3%
T 7
 
5.6%
I 6
 
4.8%
W 5
 
4.0%
F 5
 
4.0%
E 4
 
3.2%
Other values (10) 21
16.7%
Decimal Number
ValueCountFrequency (%)
0 8
29.6%
2 6
22.2%
1 5
18.5%
3 4
14.8%
8 2
 
7.4%
5 1
 
3.7%
7 1
 
3.7%
Other Punctuation
ValueCountFrequency (%)
, 19
37.3%
. 16
31.4%
" 6
 
11.8%
: 5
 
9.8%
& 3
 
5.9%
/ 2
 
3.9%
Space Separator
ValueCountFrequency (%)
407
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3341
87.1%
Common 497
 
12.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 388
11.6%
a 308
 
9.2%
r 258
 
7.7%
t 250
 
7.5%
d 224
 
6.7%
n 223
 
6.7%
o 217
 
6.5%
i 190
 
5.7%
l 185
 
5.5%
s 175
 
5.2%
Other values (35) 923
27.6%
Common
ValueCountFrequency (%)
407
81.9%
, 19
 
3.8%
. 16
 
3.2%
0 8
 
1.6%
" 6
 
1.2%
2 6
 
1.2%
- 6
 
1.2%
1 5
 
1.0%
: 5
 
1.0%
3 4
 
0.8%
Other values (7) 15
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3838
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
407
 
10.6%
e 388
 
10.1%
a 308
 
8.0%
r 258
 
6.7%
t 250
 
6.5%
d 224
 
5.8%
n 223
 
5.8%
o 217
 
5.7%
i 190
 
5.0%
l 185
 
4.8%
Other values (52) 1188
31.0%

locationID
Text

Missing 

Distinct: 185
Distinct (%): 17.7%
Missing: 603581
Missing (%): 99.8%
Memory size: 4.6 MiB

Length

Max length: 40
Median length: 14
Mean length: 10.78947368
Min length: 1

Characters and Unicode

Total characters: 11275
Distinct characters: 56
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 94
Unique (%): 9.0%

Sample

1st row: MEI Site 97-81
2nd row: RD-044
3rd row: MEI Site 97-81
4th row: MEI Site 97-81
5th row: MEI Site 97-81

Value | Count | Frequency (%)
mei | 652 | 27.5%
site | 610 | 25.7%
97-81 | 301 | 12.7%
97-92 | 132 | 5.6%
97-90 | 52 | 2.2%
97-58 | 46 | 1.9%
97-74 | 31 | 1.3%
97-88 | 26 | 1.1%
97-93 | 24 | 1.0%
k-m1 | 19 | 0.8%
Other values (195) | 479 | 20.2%

Most occurring characters

ValueCountFrequency (%)
1327
 
11.8%
- 986
 
8.7%
9 904
 
8.0%
7 770
 
6.8%
M 698
 
6.2%
I 659
 
5.8%
E 656
 
5.8%
t 638
 
5.7%
e 637
 
5.6%
i 624
 
5.5%
Other values (46) 3376
29.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3620 | 32.1%
Uppercase Letter | 3287 | 29.2%
Lowercase Letter | 2029 | 18.0%
Space Separator | 1327 | 11.8%
Dash Punctuation | 986 | 8.7%
Other Punctuation | 26 | 0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 698
21.2%
I 659
20.0%
E 656
20.0%
S 609
18.5%
R 278
 
8.5%
D 272
 
8.3%
K 20
 
0.6%
J 14
 
0.4%
N 11
 
0.3%
L 11
 
0.3%
Other values (11) 59
 
1.8%
Lowercase Letter
ValueCountFrequency (%)
t 638
31.4%
e 637
31.4%
i 624
30.8%
l 27
 
1.3%
a 20
 
1.0%
s 20
 
1.0%
r 10
 
0.5%
o 8
 
0.4%
n 7
 
0.3%
p 7
 
0.3%
Other values (9) 31
 
1.5%
Decimal Number
ValueCountFrequency (%)
9 904
25.0%
7 770
21.3%
1 571
15.8%
8 458
12.7%
2 322
 
8.9%
0 184
 
5.1%
5 143
 
4.0%
4 95
 
2.6%
6 87
 
2.4%
3 86
 
2.4%
Other Punctuation
ValueCountFrequency (%)
# 19
73.1%
, 5
 
19.2%
. 1
 
3.8%
: 1
 
3.8%
Space Separator
ValueCountFrequency (%)
1327
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 986
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5959
52.9%
Latin 5316
47.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 698
13.1%
I 659
12.4%
E 656
12.3%
t 638
12.0%
e 637
12.0%
i 624
11.7%
S 609
11.5%
R 278
 
5.2%
D 272
 
5.1%
l 27
 
0.5%
Other values (30) 218
 
4.1%
Common
ValueCountFrequency (%)
1327
22.3%
- 986
16.5%
9 904
15.2%
7 770
12.9%
1 571
9.6%
8 458
 
7.7%
2 322
 
5.4%
0 184
 
3.1%
5 143
 
2.4%
4 95
 
1.6%
Other values (6) 199
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11275
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1327
 
11.8%
- 986
 
8.7%
9 904
 
8.0%
7 770
 
6.8%
M 698
 
6.2%
I 659
 
5.8%
E 656
 
5.8%
t 638
 
5.7%
e 637
 
5.6%
i 624
 
5.5%
Other values (46) 3376
29.9%

higherGeography
Text

Missing 

Distinct: 10596
Distinct (%): 2.4%
Missing: 156072
Missing (%): 25.8%
Memory size: 4.6 MiB

Length

Max length: 101
Median length: 91
Mean length: 30.3893578
Min length: 4

Characters and Unicode

Total characters: 13631268
Distinct characters: 132
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 5

Unique

Unique: 3142
Unique (%): 0.7%

Sample

1st row: United States, [Not Stated], [Not Stated]
2nd row: Costa Rica, Cartago, [Not Stated]
3rd row: United States, Alaska, Aleutians West
4th row: United States, Virginia, Virginia Beach
5th row: United States, New York, [Not Stated]

Value | Count | Frequency (%)
united | 222825 | 12.1%
states | 221093 | 12.1%
not | 167986 | 9.2%
stated | 167984 | 9.2%
california | 23408 | 1.3%
virginia | 23318 | 1.3%
new | 22501 | 1.2%
colorado | 21080 | 1.1%
mexico | 21000 | 1.1%
canada | 16228 | 0.9%
Other values (6796) | 927046 | 50.5%

Most occurring characters

ValueCountFrequency (%)
a 1386625
 
10.2%
t 1386616
 
10.2%
1385915
 
10.2%
e 1090858
 
8.0%
i 815973
 
6.0%
n 814117
 
6.0%
, 798806
 
5.9%
o 692454
 
5.1%
d 580356
 
4.3%
s 501626
 
3.7%
Other values (122) 4177922
30.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9267024 | 68.0%
Uppercase Letter | 1826318 | 13.4%
Space Separator | 1385915 | 10.2%
Other Punctuation | 805648 | 5.9%
Open Punctuation | 168013 | 1.2%
Close Punctuation | 167964 | 1.2%
Dash Punctuation | 10307 | 0.1%
Decimal Number | 75 | < 0.1%
Modifier Letter | 2 | < 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1386625
15.0%
t 1386616
15.0%
e 1090858
11.8%
i 815973
8.8%
n 814117
8.8%
o 692454
7.5%
d 580356
6.3%
s 501626
 
5.4%
r 454233
 
4.9%
l 313851
 
3.4%
Other values (59) 1230315
13.3%
Uppercase Letter
ValueCountFrequency (%)
S 462242
25.3%
U 242069
13.3%
N 220708
12.1%
C 174669
 
9.6%
M 92421
 
5.1%
P 64234
 
3.5%
B 57594
 
3.2%
A 54173
 
3.0%
T 52081
 
2.9%
I 45075
 
2.5%
Other values (27) 361052
19.8%
Other Punctuation
ValueCountFrequency (%)
, 798806
99.2%
' 3983
 
0.5%
. 2433
 
0.3%
/ 183
 
< 0.1%
? 152
 
< 0.1%
& 50
 
< 0.1%
: 39
 
< 0.1%
; 1
 
< 0.1%
¡ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 46
61.3%
9 14
 
18.7%
4 11
 
14.7%
2 2
 
2.7%
8 1
 
1.3%
1 1
 
1.3%
Dash Punctuation
ValueCountFrequency (%)
- 10283
99.8%
22
 
0.2%
2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 167979
> 99.9%
( 34
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 167930
> 99.9%
) 34
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1385915
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 2
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%
Control
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11093342
81.4%
Common 2537926
 
18.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1386625
12.5%
t 1386616
12.5%
e 1090858
 
9.8%
i 815973
 
7.4%
n 814117
 
7.3%
o 692454
 
6.2%
d 580356
 
5.2%
s 501626
 
4.5%
S 462242
 
4.2%
r 454233
 
4.1%
Other values (96) 2908242
26.2%
Common
ValueCountFrequency (%)
1385915
54.6%
, 798806
31.5%
[ 167979
 
6.6%
] 167930
 
6.6%
- 10283
 
0.4%
' 3983
 
0.2%
. 2433
 
0.1%
/ 183
 
< 0.1%
? 152
 
< 0.1%
& 50
 
< 0.1%
Other values (16) 212
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13624976
> 99.9%
None 6244
 
< 0.1%
Punctuation 24
 
< 0.1%
Latin Ext Additional 22
 
< 0.1%
Modifier Letters 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1386625
 
10.2%
t 1386616
 
10.2%
1385915
 
10.2%
e 1090858
 
8.0%
i 815973
 
6.0%
n 814117
 
6.0%
, 798806
 
5.9%
o 692454
 
5.1%
d 580356
 
4.3%
s 501626
 
3.7%
Other values (63) 4171630
30.6%
None
ValueCountFrequency (%)
á 1227
19.7%
ü 1113
17.8%
í 1027
16.4%
ó 731
11.7%
é 700
11.2%
ã 292
 
4.7%
ô 268
 
4.3%
ø 167
 
2.7%
è 135
 
2.2%
ä 68
 
1.1%
Other values (45) 516
8.3%
Punctuation
ValueCountFrequency (%)
22
91.7%
2
 
8.3%
Latin Ext Additional
ValueCountFrequency (%)
22
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 2
100.0%
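
The higherGeography samples follow a comma-separated pattern with "[Not Stated]" standing in for unknown levels, which explains why "not" and "stated" rank so high in the word counts. The sketch below splits a value into its parts and maps the placeholder to `None`. It assumes no place name contains a comma, which the report does not confirm, and `split_higher_geography` is a hypothetical helper.

```python
from typing import List, Optional

PLACEHOLDER = "[Not Stated]"


def split_higher_geography(value: str) -> List[Optional[str]]:
    """Split a higherGeography string on commas; placeholder parts become None."""
    parts = [part.strip() for part in value.split(",")]
    return [None if part == PLACEHOLDER else part for part in parts]


for raw in [
    "United States, Alaska, Aleutians West",
    "Costa Rica, Cartago, [Not Stated]",
    "United States, [Not Stated], [Not Stated]",
]:
    print(split_higher_geography(raw))
```

The function returns however many parts the value contains rather than forcing exactly three levels, since the minimum length of 4 suggests some values are shorter.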

continent
Text

Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 199137
Missing (%): 32.9%
Memory size: 4.6 MiB

Length

Max length: 13
Median length: 13
Mean length: 11.12657803
Min length: 4

Characters and Unicode

Total characters: 4511705
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NORTH_AMERICA
2nd row: NORTH_AMERICA
3rd row: NORTH_AMERICA
4th row: NORTH_AMERICA
5th row: NORTH_AMERICA

Value | Count | Frequency (%)
north_america | 259896 | 64.1%
asia | 50862 | 12.5%
south_america | 49534 | 12.2%
africa | 21692 | 5.3%
oceania | 14473 | 3.6%
europe | 9029 | 2.2%
antarctica | 3 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
A 792923
17.6%
R 600050
13.3%
I 396460
8.8%
C 345601
7.7%
E 341961
7.6%
O 332932
7.4%
T 309436
 
6.9%
H 309430
 
6.9%
_ 309430
 
6.9%
M 309430
 
6.9%
Other values (5) 464052
10.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 4202275 | 93.1%
Connector Punctuation | 309430 | 6.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 792923
18.9%
R 600050
14.3%
I 396460
9.4%
C 345601
8.2%
E 341961
8.1%
O 332932
7.9%
T 309436
 
7.4%
H 309430
 
7.4%
M 309430
 
7.4%
N 274372
 
6.5%
Other values (4) 189680
 
4.5%
Connector Punctuation
ValueCountFrequency (%)
_ 309430
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4202275
93.1%
Common 309430
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 792923
18.9%
R 600050
14.3%
I 396460
9.4%
C 345601
8.2%
E 341961
8.1%
O 332932
7.9%
T 309436
 
7.4%
H 309430
 
7.4%
M 309430
 
7.4%
N 274372
 
6.5%
Other values (4) 189680
 
4.5%
Common
ValueCountFrequency (%)
_ 309430
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4511705
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 792923
17.6%
R 600050
13.3%
I 396460
8.8%
C 345601
7.7%
E 341961
7.6%
O 332932
7.4%
T 309436
 
6.9%
H 309430
 
6.9%
_ 309430
 
6.9%
M 309430
 
6.9%
Other values (5) 464052
10.3%
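
The continent figures can be checked for internal consistency: the seven value counts should add up to the number of non-missing rows (604626 minus 199137), and each frequency should be that value's share of the non-missing total. A minimal sketch of that check, with the counts copied from the table above. The dictionary keys use the upper-case form shown in the samples, since the value table appears to be lower-cased by the report.

```python
# Counts copied from the continent value table.
counts = {
    "NORTH_AMERICA": 259896,
    "ASIA": 50862,
    "SOUTH_AMERICA": 49534,
    "AFRICA": 21692,
    "OCEANIA": 14473,
    "EUROPE": 9029,
    "ANTARCTICA": 3,
}

total_rows = 604626      # Number of observations in the dataset
missing = 199137         # Missing cells reported for continent

non_missing = sum(counts.values())
# Each share is taken over non-missing rows, rounded to one decimal place.
shares = {name: round(100 * n / non_missing, 1) for name, n in counts.items()}

print(non_missing == total_rows - missing)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```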

islandGroup
Text

Missing 

Distinct: 72
Distinct (%): 2.9%
Missing: 602107
Missing (%): 99.6%
Memory size: 4.6 MiB

Length

Max length: 22
Median length: 13
Mean length: 13.72052402
Min length: 5

Characters and Unicode

Total characters: 34562
Distinct characters: 49
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 21
Unique (%): 0.8%

Sample

1st row: Sunda Islands
2nd row: Inner Islands
3rd row: Viti Levu Group
4th row: Chuuk Lagoon
5th row: Sunda Islands

Value | Count | Frequency (%)
islands | 2159 | 42.2%
sunda | 955 | 18.7%
marquesas | 249 | 4.9%
solomon | 226 | 4.4%
bass | 171 | 3.3%
chuuk | 149 | 2.9%
lagoon | 149 | 2.9%
outer | 149 | 2.9%
inner | 140 | 2.7%
group | 100 | 2.0%
Other values (78) | 673 | 13.1%

Most occurring characters

ValueCountFrequency (%)
s 5363
15.5%
a 4393
12.7%
n 3946
11.4%
d 3264
9.4%
(space) 2601
7.5%
l 2567
7.4%
I 2312
6.7%
u 1952
 
5.6%
S 1249
 
3.6%
o 1226
 
3.5%
Other values (39) 5689
16.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26822
77.6%
Uppercase Letter 5120
 
14.8%
Space Separator 2601
 
7.5%
Other Punctuation 19
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 5363
20.0%
a 4393
16.4%
n 3946
14.7%
d 3264
12.2%
l 2567
9.6%
u 1952
 
7.3%
o 1226
 
4.6%
r 905
 
3.4%
e 893
 
3.3%
i 343
 
1.3%
Other values (14) 1970
 
7.3%
Uppercase Letter
ValueCountFrequency (%)
I 2312
45.2%
S 1249
24.4%
M 256
 
5.0%
L 237
 
4.6%
C 200
 
3.9%
B 171
 
3.3%
O 158
 
3.1%
G 147
 
2.9%
V 87
 
1.7%
N 75
 
1.5%
Other values (12) 228
 
4.5%
Other Punctuation
ValueCountFrequency (%)
' 10
52.6%
. 9
47.4%
Space Separator
ValueCountFrequency (%)
(space) 2601
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31942
92.4%
Common 2620
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 5363
16.8%
a 4393
13.8%
n 3946
12.4%
d 3264
10.2%
l 2567
8.0%
I 2312
7.2%
u 1952
 
6.1%
S 1249
 
3.9%
o 1226
 
3.8%
r 905
 
2.8%
Other values (36) 4765
14.9%
Common
ValueCountFrequency (%)
(space) 2601
99.3%
' 10
 
0.4%
. 9
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34562
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 5363
15.5%
a 4393
12.7%
n 3946
11.4%
d 3264
9.4%
(space) 2601
7.5%
l 2567
7.4%
I 2312
6.7%
u 1952
 
5.6%
S 1249
 
3.6%
o 1226
 
3.5%
Other values (39) 5689
16.5%

island
Text

Missing 

Distinct436
Distinct (%)4.7%
Missing595261
Missing (%)98.5%
Memory size4.6 MiB

Length

Max length24
Median length21
Mean length9.325680726
Min length3

Characters and Unicode

Total characters87335
Distinct characters62
Distinct categories8
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique168
Unique (%)1.8%

Sample

1st rowSouth Island
2nd rowPohnpei
3rd rowSouth Island
4th rowOahu
5th rowGuadalcanal
ValueCountFrequency (%)
island 3167
21.5%
south 1636
 
11.1%
java 883
 
6.0%
levu 565
 
3.8%
viti 541
 
3.7%
north 519
 
3.5%
guadalcanal 327
 
2.2%
borneo 253
 
1.7%
hiva 247
 
1.7%
key 246
 
1.7%
Other values (438) 6371
43.2%

Most occurring characters

ValueCountFrequency (%)
a 12931
14.8%
n 6143
 
7.0%
l 5485
 
6.3%
o 5446
 
6.2%
(space) 5390
 
6.2%
u 4466
 
5.1%
d 4450
 
5.1%
s 4126
 
4.7%
e 3908
 
4.5%
t 3745
 
4.3%
Other values (52) 31245
35.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 66993
76.7%
Uppercase Letter 14738
 
16.9%
Space Separator 5390
 
6.2%
Other Punctuation 169
 
0.2%
Dash Punctuation 18
 
< 0.1%
Open Punctuation 13
 
< 0.1%
Close Punctuation 13
 
< 0.1%
Modifier Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12931
19.3%
n 6143
9.2%
l 5485
8.2%
o 5446
8.1%
u 4466
 
6.7%
d 4450
 
6.6%
s 4126
 
6.2%
e 3908
 
5.8%
t 3745
 
5.6%
i 3651
 
5.4%
Other values (19) 12642
18.9%
Uppercase Letter
ValueCountFrequency (%)
I 3295
22.4%
S 2358
16.0%
N 1067
 
7.2%
J 891
 
6.0%
L 820
 
5.6%
B 722
 
4.9%
V 681
 
4.6%
G 648
 
4.4%
M 648
 
4.4%
H 619
 
4.2%
Other values (14) 2989
20.3%
Other Punctuation
ValueCountFrequency (%)
' 164
97.0%
. 5
 
3.0%
Open Punctuation
ValueCountFrequency (%)
( 12
92.3%
[ 1
 
7.7%
Close Punctuation
ValueCountFrequency (%)
) 12
92.3%
] 1
 
7.7%
Space Separator
ValueCountFrequency (%)
(space) 5390
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 81731
93.6%
Common 5604
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 12931
15.8%
n 6143
 
7.5%
l 5485
 
6.7%
o 5446
 
6.7%
u 4466
 
5.5%
d 4450
 
5.4%
s 4126
 
5.0%
e 3908
 
4.8%
t 3745
 
4.6%
i 3651
 
4.5%
Other values (43) 27380
33.5%
Common
ValueCountFrequency (%)
(space) 5390
96.2%
' 164
 
2.9%
- 18
 
0.3%
( 12
 
0.2%
) 12
 
0.2%
. 5
 
0.1%
ʻ 1
 
< 0.1%
[ 1
 
< 0.1%
] 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 87309
> 99.9%
None 25
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 12931
14.8%
n 6143
 
7.0%
l 5485
 
6.3%
o 5446
 
6.2%
(space) 5390
 
6.2%
u 4466
 
5.1%
d 4450
 
5.1%
s 4126
 
4.7%
e 3908
 
4.5%
t 3745
 
4.3%
Other values (47) 31219
35.8%
None
ValueCountFrequency (%)
ñ 13
52.0%
ó 7
28.0%
é 4
 
16.0%
Ž 1
 
4.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

countryCode
Text

Missing 

Distinct217
Distinct (%)< 0.1%
Missing163440
Missing (%)27.0%
Memory size4.6 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters882372
Distinct characters26
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10
Unique (%)< 0.1%

Sample

1st rowUS
2nd rowCR
3rd rowUS
4th rowUS
5th rowUS
ValueCountFrequency (%)
us 217888
49.4%
ca 16227
 
3.7%
mx 15807
 
3.6%
cn 14551
 
3.3%
br 12970
 
2.9%
cr 8902
 
2.0%
pe 7635
 
1.7%
in 7034
 
1.6%
ph 6836
 
1.5%
pa 6325
 
1.4%
Other values (207) 127011
28.8%
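Every countryCode value here has length 2 and consists only of uppercase letters, which is consistent with ISO 3166-1 alpha-2 codes (the word table above shows them lowercased). A minimal sketch of a shape check follows. It only tests the form of a code, not whether the code is actually assigned, and the helper name is an assumption made for illustration.

```python
import re

# Two ASCII uppercase letters, nothing else
ALPHA2_SHAPE = re.compile(r"[A-Z]{2}")


def has_alpha2_shape(code):
    """True if `code` is exactly two ASCII uppercase letters.

    This checks the shape only; it does not confirm the code is assigned.
    """
    return ALPHA2_SHAPE.fullmatch(code) is not None


for code in ["US", "CR", "us", "USA"]:
    print(code, has_alpha2_shape(code))
```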

Most occurring characters

ValueCountFrequency (%)
U 226074
25.6%
S 225013
25.5%
C 60558
 
6.9%
A 35637
 
4.0%
P 33561
 
3.8%
R 32631
 
3.7%
N 30863
 
3.5%
M 28480
 
3.2%
E 27423
 
3.1%
B 22245
 
2.5%
Other values (16) 159887
18.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 882372
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 226074
25.6%
S 225013
25.5%
C 60558
 
6.9%
A 35637
 
4.0%
P 33561
 
3.8%
R 32631
 
3.7%
N 30863
 
3.5%
M 28480
 
3.2%
E 27423
 
3.1%
B 22245
 
2.5%
Other values (16) 159887
18.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 882372
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 226074
25.6%
S 225013
25.5%
C 60558
 
6.9%
A 35637
 
4.0%
P 33561
 
3.8%
R 32631
 
3.7%
N 30863
 
3.5%
M 28480
 
3.2%
E 27423
 
3.1%
B 22245
 
2.5%
Other values (16) 159887
18.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 882372
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 226074
25.6%
S 225013
25.5%
C 60558
 
6.9%
A 35637
 
4.0%
P 33561
 
3.8%
R 32631
 
3.7%
N 30863
 
3.5%
M 28480
 
3.2%
E 27423
 
3.1%
B 22245
 
2.5%
Other values (16) 159887
18.1%

stateProvince
Text

Missing 

Distinct3068
Distinct (%)0.7%
Missing173217
Missing (%)28.6%
Memory size4.6 MiB

Length

Max length57
Median length44
Mean length9.044864618
Min length2

Characters and Unicode

Total characters3902036
Distinct characters116
Distinct categories10
Distinct scripts2
Distinct blocks5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique808
Unique (%)0.2%

Sample

1st row[Not Stated]
2nd rowCartago
3rd rowAlaska
4th rowVirginia
5th rowNew York
ValueCountFrequency (%)
not 29432
 
5.2%
stated 29432
 
5.2%
california 23319
 
4.1%
virginia 22011
 
3.9%
colorado 20952
 
3.7%
new 16649
 
2.9%
texas 12340
 
2.2%
arizona 12144
 
2.1%
florida 9882
 
1.7%
maryland 9606
 
1.7%
Other values (2915) 379808
67.2%
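The word table above lists lowercased tokens such as "not" and "stated", which suggests that values like "[Not Stated]" are split on whitespace, lowercased, and stripped of surrounding punctuation. The sketch below reproduces that behaviour under those assumptions. It is not taken from the profiling tool, and the function name is hypothetical.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercased, punctuation-stripped tokens across all values."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            token = token.strip(string.punctuation)
            if token:  # skip tokens that were punctuation only
                counts[token] += 1
    return counts


# The five sample rows shown for stateProvince
sample = ["[Not Stated]", "Cartago", "Alaska", "Virginia", "New York"]
print(word_counts(sample))
```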

Most occurring characters

ValueCountFrequency (%)
a 524269
 
13.4%
o 333137
 
8.5%
i 321738
 
8.2%
n 299056
 
7.7%
r 250043
 
6.4%
e 216664
 
5.6%
t 208608
 
5.3%
s 151897
 
3.9%
l 138272
 
3.5%
(space) 134166
 
3.4%
Other values (106) 1324186
33.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3135181
80.3%
Uppercase Letter 563753
 
14.4%
Space Separator 134166
 
3.4%
Open Punctuation 29401
 
0.8%
Close Punctuation 29392
 
0.8%
Dash Punctuation 8108
 
0.2%
Other Punctuation 1958
 
0.1%
Decimal Number 75
 
< 0.1%
Modifier Letter 1
 
< 0.1%
Control 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 524269
16.7%
o 333137
10.6%
i 321738
10.3%
n 299056
9.5%
r 250043
8.0%
e 216664
 
6.9%
t 208608
 
6.7%
s 151897
 
4.8%
l 138272
 
4.4%
d 112993
 
3.6%
Other values (49) 578504
18.5%
Uppercase Letter
ValueCountFrequency (%)
C 79646
14.1%
N 67131
11.9%
S 61370
10.9%
M 46167
 
8.2%
T 31296
 
5.6%
A 30285
 
5.4%
V 29054
 
5.2%
W 27111
 
4.8%
P 20324
 
3.6%
I 18301
 
3.2%
Other values (25) 153068
27.2%
Other Punctuation
ValueCountFrequency (%)
. 987
50.4%
' 638
32.6%
? 138
 
7.0%
/ 121
 
6.2%
, 70
 
3.6%
: 3
 
0.2%
¡ 1
 
0.1%
Decimal Number
ValueCountFrequency (%)
3 46
61.3%
9 14
 
18.7%
4 11
 
14.7%
2 2
 
2.7%
8 1
 
1.3%
1 1
 
1.3%
Open Punctuation
ValueCountFrequency (%)
[ 29400
> 99.9%
( 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 29391
> 99.9%
) 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 8086
99.7%
22
 
0.3%
Space Separator
ValueCountFrequency (%)
(space) 134166
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%
Control
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3698934
94.8%
Common 203102
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 524269
14.2%
o 333137
 
9.0%
i 321738
 
8.7%
n 299056
 
8.1%
r 250043
 
6.8%
e 216664
 
5.9%
t 208608
 
5.6%
s 151897
 
4.1%
l 138272
 
3.7%
d 112993
 
3.1%
Other values (84) 1142257
30.9%
Common
ValueCountFrequency (%)
(space) 134166
66.1%
[ 29400
 
14.5%
] 29391
 
14.5%
- 8086
 
4.0%
. 987
 
0.5%
' 638
 
0.3%
? 138
 
0.1%
/ 121
 
0.1%
, 70
 
< 0.1%
3 46
 
< 0.1%
Other values (12) 59
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3896893
99.9%
None 5098
 
0.1%
Latin Ext Additional 22
 
< 0.1%
Punctuation 22
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 524269
 
13.5%
o 333137
 
8.5%
i 321738
 
8.3%
n 299056
 
7.7%
r 250043
 
6.4%
e 216664
 
5.6%
t 208608
 
5.4%
s 151897
 
3.9%
l 138272
 
3.5%
(space) 134166
 
3.4%
Other values (60) 1319043
33.8%
None
ValueCountFrequency (%)
á 1200
23.5%
ü 990
19.4%
í 928
18.2%
ó 488
9.6%
é 410
 
8.0%
ã 292
 
5.7%
ø 158
 
3.1%
ô 125
 
2.5%
è 117
 
2.3%
ä 54
 
1.1%
Other values (33) 336
 
6.6%
Latin Ext Additional
ValueCountFrequency (%)
22
100.0%
Punctuation
ValueCountFrequency (%)
22
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

county
Text

Missing 

Distinct4068
Distinct (%)1.2%
Missing254826
Missing (%)42.1%
Memory size4.6 MiB

Length

Max length51
Median length45
Mean length9.456223556
Min length1

Characters and Unicode

Total characters3307787
Distinct characters98
Distinct categories8
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1157
Unique (%)0.3%

Sample

1st row[Not Stated]
2nd row[Not Stated]
3rd rowAleutians West
4th rowVirginia Beach
5th row[Not Stated]
ValueCountFrequency (%)
not 132036
25.3%
stated 132034
25.3%
boulder 6789
 
1.3%
creek 6760
 
1.3%
clear 6751
 
1.3%
san 5404
 
1.0%
montgomery 4939
 
0.9%
cochise 4320
 
0.8%
prince 3491
 
0.7%
tuolumne 3205
 
0.6%
Other values (4079) 215253
41.3%
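About a quarter of the tokens in county are "not" and "stated", so the placeholder "[Not Stated]" hides a large amount of missing information that the Missing count does not include. A minimal sketch of counting such placeholders alongside true gaps is given below. The placeholder set and function name are assumptions, and other placeholder spellings may exist in the data.

```python
# Assumption: only this placeholder spelling is treated as missing
PLACEHOLDERS = {"[not stated]"}


def effective_missing(values):
    """Count entries that are None, blank, or a known placeholder."""
    missing = 0
    for value in values:
        if value is None:
            missing += 1
            continue
        cleaned = value.strip().lower()
        if not cleaned or cleaned in PLACEHOLDERS:
            missing += 1
    return missing


# The five sample rows shown for county
sample = ["[Not Stated]", "[Not Stated]", "Aleutians West", "Virginia Beach", "[Not Stated]"]
print(effective_missing(sample), "of", len(sample))
```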

Most occurring characters

ValueCountFrequency (%)
t 455384
13.8%
a 309851
 
9.4%
e 305684
 
9.2%
o 264700
 
8.0%
(space) 171182
 
5.2%
d 169196
 
5.1%
S 152102
 
4.6%
N 137663
 
4.2%
n 133833
 
4.0%
[ 132054
 
4.0%
Other values (88) 1076138
32.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2346650
70.9%
Uppercase Letter 519082
 
15.7%
Space Separator 171182
 
5.2%
Open Punctuation 132072
 
4.0%
Close Punctuation 132032
 
4.0%
Other Punctuation 4600
 
0.1%
Dash Punctuation 2168
 
0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 455384
19.4%
a 309851
13.2%
e 305684
13.0%
o 264700
11.3%
d 169196
 
7.2%
n 133833
 
5.7%
r 128718
 
5.5%
i 96626
 
4.1%
l 92683
 
3.9%
s 72029
 
3.1%
Other values (42) 317946
13.5%
Uppercase Letter
ValueCountFrequency (%)
S 152102
29.3%
N 137663
26.5%
C 39705
 
7.6%
B 24568
 
4.7%
M 21490
 
4.1%
P 16764
 
3.2%
W 13595
 
2.6%
L 12293
 
2.4%
G 12064
 
2.3%
T 10761
 
2.1%
Other values (23) 78077
15.0%
Other Punctuation
ValueCountFrequency (%)
' 3064
66.6%
. 1321
28.7%
, 105
 
2.3%
/ 56
 
1.2%
& 50
 
1.1%
? 4
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 132054
> 99.9%
( 18
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 132014
> 99.9%
) 18
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 171182
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2168
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2865732
86.6%
Common 442055
 
13.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 455384
15.9%
a 309851
10.8%
e 305684
10.7%
o 264700
 
9.2%
d 169196
 
5.9%
S 152102
 
5.3%
N 137663
 
4.8%
n 133833
 
4.7%
r 128718
 
4.5%
i 96626
 
3.4%
Other values (75) 711975
24.8%
Common
ValueCountFrequency (%)
(space) 171182
38.7%
[ 132054
29.9%
] 132014
29.9%
' 3064
 
0.7%
- 2168
 
0.5%
. 1321
 
0.3%
, 105
 
< 0.1%
/ 56
 
< 0.1%
& 50
 
< 0.1%
( 18
 
< 0.1%
Other values (3) 23
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3306740
> 99.9%
None 1047
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 455384
13.8%
a 309851
 
9.4%
e 305684
 
9.2%
o 264700
 
8.0%
(space) 171182
 
5.2%
d 169196
 
5.1%
S 152102
 
4.6%
N 137663
 
4.2%
n 133833
 
4.0%
[ 132054
 
4.0%
Other values (55) 1075091
32.5%
None
ValueCountFrequency (%)
é 285
27.2%
ó 235
22.4%
ü 123
11.7%
í 99
 
9.5%
ô 74
 
7.1%
Ñ 29
 
2.8%
á 27
 
2.6%
è 18
 
1.7%
ś 16
 
1.5%
ć 15
 
1.4%
Other values (23) 126
12.0%

locality
Text

Missing 

Distinct76610
Distinct (%)17.2%
Missing158340
Missing (%)26.2%
Memory size4.6 MiB

Length

Max length400600
Median length180
Mean length23.74718454
Min length1

Characters and Unicode

Total characters10598036
Distinct characters149
Distinct categories16
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique44457
Unique (%)10.0%

Sample

1st row[Not Stated]
2nd rowRio Aquiares, Turrialba
3rd rowSaint Paul Island, Bering Sea
4th rowFalse Cape State Park, Wash Woods, 100 meters east of Interpreter's residence
5th row[Not Stated]
ValueCountFrequency (%)
not 65922
 
4.1%
stated 65846
 
4.1%
of 42709
 
2.7%
miles 21197
 
1.3%
kilometers 15776
 
1.0%
park 15452
 
1.0%
river 15349
 
1.0%
lake 14837
 
0.9%
near 12849
 
0.8%
creek 12664
 
0.8%
Other values (56182) 1322830
82.4%

Most occurring characters

ValueCountFrequency (%)
(space) 1107495
 
10.5%
a 957781
 
9.0%
e 764045
 
7.2%
o 666061
 
6.3%
t 632687
 
6.0%
n 514072
 
4.9%
i 493725
 
4.7%
r 484811
 
4.6%
l 394187
 
3.7%
s 365079
 
3.4%
Other values (139) 4218093
39.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7051212
66.5%
Uppercase Letter 1363576
 
12.9%
Space Separator 1107495
 
10.5%
Decimal Number 378240
 
3.6%
Other Punctuation 330337
 
3.1%
Control 166947
 
1.6%
Open Punctuation 78089
 
0.7%
Close Punctuation 78075
 
0.7%
Dash Punctuation 31559
 
0.3%
Connector Punctuation 11549
 
0.1%
Other values (6) 957
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 957781
13.6%
e 764045
10.8%
o 666061
9.4%
t 632687
9.0%
n 514072
 
7.3%
i 493725
 
7.0%
r 484811
 
6.9%
l 394187
 
5.6%
s 365079
 
5.2%
u 257241
 
3.6%
Other values (49) 1521523
21.6%
Uppercase Letter
ValueCountFrequency (%)
S 179234
13.1%
C 131330
 
9.6%
N 130256
 
9.6%
R 94760
 
6.9%
P 94322
 
6.9%
M 86675
 
6.4%
B 66017
 
4.8%
L 59567
 
4.4%
A 56700
 
4.2%
T 52375
 
3.8%
Other values (30) 412340
30.2%
Other Punctuation
ValueCountFrequency (%)
, 154021
46.6%
. 80280
24.3%
; 58000
 
17.6%
: 17454
 
5.3%
' 9316
 
2.8%
/ 6838
 
2.1%
" 1470
 
0.4%
? 1401
 
0.4%
& 867
 
0.3%
# 653
 
0.2%
Other values (5) 37
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 67501
17.8%
0 62313
16.5%
2 51275
13.6%
5 35588
9.4%
3 34065
9.0%
4 32368
8.6%
6 26433
 
7.0%
8 24052
 
6.4%
9 22778
 
6.0%
7 21867
 
5.8%
Math Symbol
ValueCountFrequency (%)
+ 299
39.4%
~ 250
33.0%
= 121
16.0%
| 85
 
11.2%
< 2
 
0.3%
> 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 69831
89.4%
( 8154
 
10.4%
{ 103
 
0.1%
1
 
< 0.1%
Control
ValueCountFrequency (%)
166196
99.6%
749
 
0.4%
 2
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 69787
89.4%
) 8145
 
10.4%
} 143
 
0.2%
Modifier Symbol
ValueCountFrequency (%)
´ 3
60.0%
¯ 2
40.0%
Space Separator
ValueCountFrequency (%)
(space) 1107495
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 31559
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 11549
100.0%
Other Symbol
ValueCountFrequency (%)
° 112
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 50
100.0%
Final Punctuation
ValueCountFrequency (%)
26
100.0%
Initial Punctuation
ValueCountFrequency (%)
6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8414788
79.4%
Common 2183248
 
20.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 957781
 
11.4%
e 764045
 
9.1%
o 666061
 
7.9%
t 632687
 
7.5%
n 514072
 
6.1%
i 493725
 
5.9%
r 484811
 
5.8%
l 394187
 
4.7%
s 365079
 
4.3%
u 257241
 
3.1%
Other values (89) 2885099
34.3%
Common
ValueCountFrequency (%)
(space) 1107495
50.7%
166196
 
7.6%
, 154021
 
7.1%
. 80280
 
3.7%
[ 69831
 
3.2%
] 69787
 
3.2%
1 67501
 
3.1%
0 62313
 
2.9%
; 58000
 
2.7%
2 51275
 
2.3%
Other values (40) 296549
 
13.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10595457
> 99.9%
None 2545
 
< 0.1%
Punctuation 34
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1107495
 
10.5%
a 957781
 
9.0%
e 764045
 
7.2%
o 666061
 
6.3%
t 632687
 
6.0%
n 514072
 
4.9%
i 493725
 
4.7%
r 484811
 
4.6%
l 394187
 
3.7%
s 365079
 
3.4%
Other values (82) 4215514
39.8%
None
ValueCountFrequency (%)
ñ 374
14.7%
ó 352
13.8%
é 344
13.5%
á 340
13.4%
ã 220
8.6%
ü 181
7.1%
í 149
 
5.9%
ç 117
 
4.6%
° 112
 
4.4%
¢ 50
 
2.0%
Other values (43) 306
12.0%
Punctuation
ValueCountFrequency (%)
26
76.5%
6
 
17.6%
1
 
2.9%
1
 
2.9%

verbatimElevation
Text

Missing 

Distinct1024
Distinct (%)10.3%
Missing594692
Missing (%)98.4%
Memory size4.6 MiB

Length

Max length94
Median length31
Mean length8.08838333
Min length1

Characters and Unicode

Total characters80350
Distinct characters54
Distinct categories9
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique334
Unique (%)3.4%

Sample

1st row140 meters
2nd row3900 feet
3rd row5940 feet
4th row180 meters
5th row3000 feet
ValueCountFrequency (%)
m 2782
 
14.5%
feet 2472
 
12.9%
meters 1521
 
7.9%
ft 1465
 
7.6%
1000 347
 
1.8%
level 318
 
1.7%
sea 318
 
1.7%
300 305
 
1.6%
near 276
 
1.4%
3200 236
 
1.2%
Other values (619) 9192
47.8%
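The tokens above show verbatimElevation mixing several unit spellings ("m", "meters", "ft", "feet") with free text such as "near sea level". One possible way to normalise the simple cases to metres is sketched below. The pattern, the helper name, and the decision to return None for anything unrecognised are assumptions made for illustration; ranges and free-text entries are deliberately left unparsed.

```python
import re

FEET_TO_METERS = 0.3048  # exact, by definition of the international foot

# A single number followed by one recognised unit, e.g. "140 meters" or "1000 ft."
PATTERN = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*(m|meters|ft|feet)\.?\s*$",
    re.IGNORECASE,
)


def elevation_in_meters(text):
    """Parse strings like '140 meters' or '3900 feet'; return None if unrecognised."""
    match = PATTERN.match(text)
    if match is None:
        return None
    value = float(match.group(1))
    unit = match.group(2).lower()
    return value * FEET_TO_METERS if unit in ("ft", "feet") else value


for raw in ["140 meters", "3900 feet", "near sea level"]:
    print(raw, "->", elevation_in_meters(raw))
```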

Most occurring characters

ValueCountFrequency (%)
0 16889
21.0%
e 9358
11.6%
(space) 9298
11.6%
t 5738
 
7.1%
m 5102
 
6.3%
f 4103
 
5.1%
1 4088
 
5.1%
5 3791
 
4.7%
2 2912
 
3.6%
. 2458
 
3.1%
Other values (44) 16613
20.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 36341
45.2%
Lowercase Letter 30893
38.4%
Space Separator 9298
 
11.6%
Other Punctuation 2945
 
3.7%
Dash Punctuation 765
 
1.0%
Uppercase Letter 44
 
0.1%
Open Punctuation 23
 
< 0.1%
Close Punctuation 23
 
< 0.1%
Math Symbol 18
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 9358
30.3%
t 5738
18.6%
m 5102
16.5%
f 4103
13.3%
r 1891
 
6.1%
s 1851
 
6.0%
a 854
 
2.8%
l 695
 
2.2%
n 346
 
1.1%
v 331
 
1.1%
Other values (12) 624
 
2.0%
Decimal Number
ValueCountFrequency (%)
0 16889
46.5%
1 4088
 
11.2%
5 3791
 
10.4%
2 2912
 
8.0%
3 2121
 
5.8%
4 1908
 
5.3%
6 1282
 
3.5%
7 1249
 
3.4%
8 1174
 
3.2%
9 927
 
2.6%
Uppercase Letter
ValueCountFrequency (%)
F 30
68.2%
N 5
 
11.4%
L 3
 
6.8%
A 2
 
4.5%
P 1
 
2.3%
B 1
 
2.3%
S 1
 
2.3%
W 1
 
2.3%
Other Punctuation
ValueCountFrequency (%)
. 2458
83.5%
' 338
 
11.5%
, 126
 
4.3%
& 13
 
0.4%
? 9
 
0.3%
/ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 22
95.7%
[ 1
 
4.3%
Close Punctuation
ValueCountFrequency (%)
) 22
95.7%
] 1
 
4.3%
Math Symbol
ValueCountFrequency (%)
~ 17
94.4%
+ 1
 
5.6%
Space Separator
ValueCountFrequency (%)
(space) 9298
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 765
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 49413
61.5%
Latin 30937
38.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 9358
30.2%
t 5738
18.5%
m 5102
16.5%
f 4103
13.3%
r 1891
 
6.1%
s 1851
 
6.0%
a 854
 
2.8%
l 695
 
2.2%
n 346
 
1.1%
v 331
 
1.1%
Other values (20) 668
 
2.2%
Common
ValueCountFrequency (%)
0 16889
34.2%
(space) 9298
18.8%
1 4088
 
8.3%
5 3791
 
7.7%
2 2912
 
5.9%
. 2458
 
5.0%
3 2121
 
4.3%
4 1908
 
3.9%
6 1282
 
2.6%
7 1249
 
2.5%
Other values (14) 3417
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 80350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 16889
21.0%
e 9358
11.6%
(space) 9298
11.6%
t 5738
 
7.1%
m 5102
 
6.3%
f 4103
 
5.1%
1 4088
 
5.1%
5 3791
 
4.7%
2 2912
 
3.6%
. 2458
 
3.1%
Other values (44) 16613
20.7%

verbatimDepth
Text

Constant  Missing 

Distinct1
Distinct (%)16.7%
Missing604620
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length25
Median length25
Mean length25
Min length25

Characters and Unicode

Total characters150
Distinct characters14
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row220m inside cave entrance
2nd row220m inside cave entrance
3rd row220m inside cave entrance
4th row220m inside cave entrance
5th row220m inside cave entrance
ValueCountFrequency (%)
220m 6
25.0%
inside 6
25.0%
cave 6
25.0%
entrance 6
25.0%

Most occurring characters

ValueCountFrequency (%)
e 24
16.0%
(space) 18
12.0%
n 18
12.0%
2 12
8.0%
i 12
8.0%
c 12
8.0%
a 12
8.0%
0 6
 
4.0%
m 6
 
4.0%
s 6
 
4.0%
Other values (4) 24
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 114
76.0%
Space Separator 18
 
12.0%
Decimal Number 18
 
12.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 24
21.1%
n 18
15.8%
i 12
10.5%
c 12
10.5%
a 12
10.5%
m 6
 
5.3%
s 6
 
5.3%
d 6
 
5.3%
v 6
 
5.3%
t 6
 
5.3%
Decimal Number
ValueCountFrequency (%)
2 12
66.7%
0 6
33.3%
Space Separator
ValueCountFrequency (%)
(space) 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 114
76.0%
Common 36
 
24.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 24
21.1%
n 18
15.8%
i 12
10.5%
c 12
10.5%
a 12
10.5%
m 6
 
5.3%
s 6
 
5.3%
d 6
 
5.3%
v 6
 
5.3%
t 6
 
5.3%
Common
ValueCountFrequency (%)
(space) 18
50.0%
2 12
33.3%
0 6
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 150
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 24
16.0%
(space) 18
12.0%
n 18
12.0%
2 12
8.0%
i 12
8.0%
c 12
8.0%
a 12
8.0%
0 6
 
4.0%
m 6
 
4.0%
s 6
 
4.0%
Other values (4) 24
16.0%
Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length19
Median length15.5
Mean length15.5
Min length12

Characters and Unicode

Total characters31
Distinct characters15
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st rowPoole, R. W.
2nd rowGarrison, Rosser W.
ValueCountFrequency (%)
w 2
33.3%
poole 1
16.7%
r 1
16.7%
garrison 1
16.7%
rosser 1
16.7%

Most occurring characters

ValueCountFrequency (%)
o 4
12.9%
(space) 4
12.9%
. 3
9.7%
r 3
9.7%
s 3
9.7%
e 2
 
6.5%
, 2
 
6.5%
R 2
 
6.5%
W 2
 
6.5%
P 1
 
3.2%
Other values (5) 5
16.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
51.6%
Uppercase Letter 6
 
19.4%
Other Punctuation 5
 
16.1%
Space Separator 4
 
12.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 4
25.0%
r 3
18.8%
s 3
18.8%
e 2
12.5%
l 1
 
6.2%
a 1
 
6.2%
i 1
 
6.2%
n 1
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
R 2
33.3%
W 2
33.3%
P 1
16.7%
G 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 3
60.0%
, 2
40.0%
Space Separator
ValueCountFrequency (%)
(space) 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22
71.0%
Common 9
29.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 4
18.2%
r 3
13.6%
s 3
13.6%
e 2
9.1%
R 2
9.1%
W 2
9.1%
P 1
 
4.5%
l 1
 
4.5%
G 1
 
4.5%
a 1
 
4.5%
Other values (2) 2
9.1%
Common
ValueCountFrequency (%)
(space) 4
44.4%
. 3
33.3%
, 2
22.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 4
12.9%
(space) 4
12.9%
. 3
9.7%
r 3
9.7%
s 3
9.7%
e 2
 
6.5%
, 2
 
6.5%
R 2
 
6.5%
W 2
 
6.5%
P 1
 
3.2%
Other values (5) 5
16.1%

decimalLatitude
Text

Missing 

Distinct38003
Distinct (%)11.9%
Missing285575
Missing (%)47.2%
Memory size4.6 MiB

Length

Max length10
Median length7
Mean length6.690350446
Min length3

Characters and Unicode

Total characters2134563
Distinct characters13
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique15800
Unique (%)5.0%

Sample

1st row9.91378
2nd row57.18
3rd row36.5787
4th row15.5864
5th row45.4838
ValueCountFrequency (%)
39.6891 5053
 
1.6%
60.75 3839
 
1.2%
60.7493 2462
 
0.8%
40.0925 2379
 
0.7%
38.02 2013
 
0.6%
42.7299 1697
 
0.5%
37.23 1343
 
0.4%
40.015 1287
 
0.4%
42.78 1170
 
0.4%
38.9559 1141
 
0.4%
Other values (37323) 296667
93.0%

Most occurring characters

ValueCountFrequency (%)
. 319051
14.9%
3 273937
12.8%
4 209092
9.8%
1 188994
8.9%
2 172377
8.1%
9 169623
7.9%
7 165639
7.8%
8 159036
7.5%
5 153205
7.2%
6 152373
7.1%
Other values (3) 171236
8.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1774325
83.1%
Other Punctuation 319051
 
14.9%
Dash Punctuation 41186
 
1.9%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 273937
15.4%
4 209092
11.8%
1 188994
10.7%
2 172377
9.7%
9 169623
9.6%
7 165639
9.3%
8 159036
9.0%
5 153205
8.6%
6 152373
8.6%
0 130049
7.3%
Other Punctuation
ValueCountFrequency (%)
. 319051
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 41186
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2134562
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
. 319051
14.9%
3 273937
12.8%
4 209092
9.8%
1 188994
8.9%
2 172377
8.1%
9 169623
7.9%
7 165639
7.8%
8 159036
7.5%
5 153205
7.2%
6 152373
7.1%
Other values (2) 171235
8.0%
Latin
ValueCountFrequency (%)
E 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2134563
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 319051
14.9%
3 273937
12.8%
4 209092
9.8%
1 188994
8.9%
2 172377
8.1%
9 169623
7.9%
7 165639
7.8%
8 159036
7.5%
5 153205
7.2%
6 152373
7.1%
Other values (3) 171236
8.0%
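decimalLatitude is profiled as Text rather than as a number, and the character tables above include a single uppercase "E", which may indicate one value written in scientific notation (an assumption; the value itself is not shown in this report). A minimal sketch of converting such strings to validated floats follows; the helper name is hypothetical.

```python
def parse_latitude(text):
    """Return the latitude as a float, or None if it is not a valid value.

    Accepts plain decimals and scientific notation (e.g. "1E-05"),
    and rejects NaN, infinities, and anything outside [-90, 90].
    """
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    # value != value is True only for NaN; the range test rejects infinities
    if value != value or not -90.0 <= value <= 90.0:
        return None
    return value


for raw in ["9.91378", "57.18", "1E-05", "95.0", "abc"]:
    print(raw, "->", parse_latitude(raw))
```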

decimalLongitude
Text

Missing 

Distinct36962
Distinct (%)11.6%
Missing285575
Missing (%)47.2%
Memory size4.6 MiB

Length

Max length10
Median length8
Mean length7.477729266
Min length3

Characters and Unicode

Total characters2385777
Distinct characters12
Distinct categories3
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique15095
Unique (%)4.7%

Sample

1st row-83.6744
2nd row-170.27
3rd row-75.8881
4th row-61.4739
5th row-75.9727
ValueCountFrequency (%)
105.644 5103
 
1.6%
139.5 3837
 
1.2%
139.504 2462
 
0.8%
105.358 2379
 
0.7%
87.8123 1697
 
0.5%
119.93 1404
 
0.4%
105.27 1361
 
0.4%
80.4178 1322
 
0.4%
0.365 1301
 
0.4%
87.76 1163
 
0.4%
Other values (36457) 297022
93.1%

Most occurring characters

ValueCountFrequency (%)
. 319051
13.4%
1 292907
12.3%
- 270810
11.4%
7 217640
9.1%
8 193876
8.1%
6 165493
6.9%
5 162714
6.8%
3 158462
6.6%
2 156819
6.6%
9 154516
6.5%
Other values (2) 293489
12.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1795916
75.3%
Other Punctuation 319051
 
13.4%
Dash Punctuation 270810
 
11.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 292907
16.3%
7 217640
12.1%
8 193876
10.8%
6 165493
9.2%
5 162714
9.1%
3 158462
8.8%
2 156819
8.7%
9 154516
8.6%
4 148470
8.3%
0 145019
8.1%
Other Punctuation
ValueCountFrequency (%)
. 319051
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 270810
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2385777
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 319051
13.4%
1 292907
12.3%
- 270810
11.4%
7 217640
9.1%
8 193876
8.1%
6 165493
6.9%
5 162714
6.8%
3 158462
6.6%
2 156819
6.6%
9 154516
6.5%
Other values (2) 293489
12.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2385777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 319051
13.4%
1 292907
12.3%
- 270810
11.4%
7 217640
9.1%
8 193876
8.1%
6 165493
6.9%
5 162714
6.8%
3 158462
6.6%
2 156819
6.6%
9 154516
6.5%
Other values (2) 293489
12.3%
Distinct1493
Distinct (%)12.5%
Missing592674
Missing (%)98.0%
Memory size4.6 MiB

Length

Max length8
Median length6
Mean length6.138386881
Min length4

Characters and Unicode

Total characters73366
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique745
Unique (%)6.2%

Sample

1st row931.0
2nd row10206.0
3rd row6642.0
4th row3036.0
5th row301.0
ValueCountFrequency (%)
3036.0 1744
 
14.6%
301.0 466
 
3.9%
34239.0 426
 
3.6%
1189.0 258
 
2.2%
20000.0 247
 
2.1%
3048.0 220
 
1.8%
15000.0 199
 
1.7%
52150.0 194
 
1.6%
14563.0 162
 
1.4%
9346.0 135
 
1.1%
Other values (1483) 7901
66.1%

Most occurring characters

ValueCountFrequency (%)
0 21190
28.9%
. 11952
16.3%
3 8252
 
11.2%
1 6352
 
8.7%
2 4892
 
6.7%
6 4647
 
6.3%
4 3910
 
5.3%
5 3500
 
4.8%
9 3065
 
4.2%
8 2861
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 61414
83.7%
Other Punctuation 11952
 
16.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 21190
34.5%
3 8252
 
13.4%
1 6352
 
10.3%
2 4892
 
8.0%
6 4647
 
7.6%
4 3910
 
6.4%
5 3500
 
5.7%
9 3065
 
5.0%
8 2861
 
4.7%
7 2745
 
4.5%
Other Punctuation
ValueCountFrequency (%)
. 11952
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 73366
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 21190
28.9%
. 11952
16.3%
3 8252
 
11.2%
1 6352
 
8.7%
2 4892
 
6.7%
6 4647
 
6.3%
4 3910
 
5.3%
5 3500
 
4.8%
9 3065
 
4.2%
8 2861
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 73366
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 21190
28.9%
. 11952
16.3%
3 8252
 
11.2%
1 6352
 
8.7%
2 4892
 
6.7%
6 4647
 
6.3%
4 3910
 
5.3%
5 3500
 
4.8%
9 3065
 
4.2%
8 2861
 
3.9%

pointRadiusSpatialFit
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters14
Distinct characters7
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st row1937011
2nd row1424710
ValueCountFrequency (%)
1937011 1
50.0%
1424710 1
50.0%

Most occurring characters

ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 14
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

verbatimCoordinateSystem
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604625
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length23
Median length23
Mean length23
Min length23

Characters and Unicode

Total characters23
Distinct characters15
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowDegrees Minutes Seconds
ValueCountFrequency (%)
degrees 1
33.3%
minutes 1
33.3%
seconds 1
33.3%

Most occurring characters

ValueCountFrequency (%)
e 5
21.7%
s 3
13.0%
(space) 2
 
8.7%
n 2
 
8.7%
D 1
 
4.3%
g 1
 
4.3%
r 1
 
4.3%
M 1
 
4.3%
i 1
 
4.3%
u 1
 
4.3%
Other values (5) 5
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18
78.3%
Uppercase Letter 3
 
13.0%
Space Separator 2
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5
27.8%
s 3
16.7%
n 2
 
11.1%
g 1
 
5.6%
r 1
 
5.6%
i 1
 
5.6%
u 1
 
5.6%
t 1
 
5.6%
c 1
 
5.6%
o 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
D 1
33.3%
M 1
33.3%
S 1
33.3%
Space Separator
ValueCountFrequency (%)
(space) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21
91.3%
Common 2
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5
23.8%
s 3
14.3%
n 2
 
9.5%
D 1
 
4.8%
g 1
 
4.8%
r 1
 
4.8%
M 1
 
4.8%
i 1
 
4.8%
u 1
 
4.8%
t 1
 
4.8%
Other values (4) 4
19.0%
Common
ValueCountFrequency (%)
(space) 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 5
21.7%
s 3
13.0%
(space) 2
 
8.7%
n 2
 
8.7%
D 1
 
4.3%
g 1
 
4.3%
r 1
 
4.3%
M 1
 
4.3%
i 1
 
4.3%
u 1
 
4.3%
Other values (5) 5
21.7%

verbatimSRS
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604625
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters10
Distinct characters8
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row1973-05-08
ValueCountFrequency (%)
1973-05-08 1
100.0%

Most occurring characters

ValueCountFrequency (%)
- 2
20.0%
0 2
20.0%
1 1
10.0%
9 1
10.0%
7 1
10.0%
3 1
10.0%
5 1
10.0%
8 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
80.0%
Dash Punctuation 2
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
25.0%
1 1
12.5%
9 1
12.5%
7 1
12.5%
3 1
12.5%
5 1
12.5%
8 1
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 2
20.0%
0 2
20.0%
1 1
10.0%
9 1
10.0%
7 1
10.0%
3 1
10.0%
5 1
10.0%
8 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 2
20.0%
0 2
20.0%
1 1
10.0%
9 1
10.0%
7 1
10.0%
3 1
10.0%
5 1
10.0%
8 1
10.0%

footprintSRS
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604625
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters3
Distinct characters3
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row128
ValueCountFrequency (%)
128 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 1
33.3%
2 1
33.3%
8 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1
33.3%
2 1
33.3%
8 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1
33.3%
2 1
33.3%
8 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1
33.3%
2 1
33.3%
8 1
33.3%

footprintSpatialFit
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604625
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters3
Distinct characters3
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row128
ValueCountFrequency (%)
128 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 1
33.3%
2 1
33.3%
8 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1
33.3%
2 1
33.3%
8 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1
33.3%
2 1
33.3%
8 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1
33.3%
2 1
33.3%
8 1
33.3%

georeferencedBy
Text

Missing 

Distinct3
Distinct (%)100.0%
Missing604623
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length35
Median length32
Mean length23.66666667
Min length4

Characters and Unicode

Total characters71
Distinct characters30
Distinct categories7
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3
Unique (%)100.0%

Sample

1st rowTroides amphrysus (Cramer, 1779)
2nd rowGynacantha membranalis Karsch, 1891
3rd row1973
ValueCountFrequency (%)
troides 1
11.1%
amphrysus 1
11.1%
cramer 1
11.1%
1779 1
11.1%
gynacantha 1
11.1%
membranalis 1
11.1%
karsch 1
11.1%
1891 1
11.1%
1973 1
11.1%
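The word table above looks like the result of splitting each value on whitespace, trimming surrounding punctuation, and lower-casing. The sketch below reproduces that behaviour for the three sample rows. It is an assumption about the tokenisation, not the profiler's actual code, and the function name is hypothetical.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lower-cased, punctuation-trimmed, whitespace-separated tokens."""
    counts = Counter()
    for value in values:
        for token in value.split():
            # Trim punctuation at the edges only; inner punctuation is kept.
            word = token.strip(string.punctuation).lower()
            if word:
                counts[word] += 1
    return counts


# The three sample rows reported for this variable.
sample = [
    "Troides amphrysus (Cramer, 1779)",
    "Gynacantha membranalis Karsch, 1891",
    "1973",
]
counts = word_counts(sample)
for word, n in counts.items():
    print(word, n, f"{100 * n / sum(counts.values()):.1f}%")
```

Each of the nine tokens occurs once, which gives the 11.1% share per word listed in the table.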

Most occurring characters

ValueCountFrequency (%)
a 8
 
11.3%
r 6
 
8.5%
(space) 6
 
8.5%
s 5
 
7.0%
m 4
 
5.6%
1 4
 
5.6%
9 3
 
4.2%
7 3
 
4.2%
e 3
 
4.2%
n 3
 
4.2%
Other values (20) 26
36.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 45
63.4%
Decimal Number 12
 
16.9%
Space Separator 6
 
8.5%
Uppercase Letter 4
 
5.6%
Other Punctuation 2
 
2.8%
Close Punctuation 1
 
1.4%
Open Punctuation 1
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
17.8%
r 6
13.3%
s 5
11.1%
m 4
8.9%
e 3
 
6.7%
n 3
 
6.7%
h 3
 
6.7%
c 2
 
4.4%
i 2
 
4.4%
y 2
 
4.4%
Other values (7) 7
15.6%
Decimal Number
ValueCountFrequency (%)
1 4
33.3%
9 3
25.0%
7 3
25.0%
8 1
 
8.3%
3 1
 
8.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
25.0%
K 1
25.0%
C 1
25.0%
G 1
25.0%
Space Separator
ValueCountFrequency (%)
(space) 6
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 49
69.0%
Common 22
31.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
16.3%
r 6
12.2%
s 5
10.2%
m 4
 
8.2%
e 3
 
6.1%
n 3
 
6.1%
h 3
 
6.1%
c 2
 
4.1%
i 2
 
4.1%
y 2
 
4.1%
Other values (11) 11
22.4%
Common
ValueCountFrequency (%)
(space) 6
27.3%
1 4
18.2%
9 3
13.6%
7 3
13.6%
, 2
 
9.1%
8 1
 
4.5%
) 1
 
4.5%
( 1
 
4.5%
3 1
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 71
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
 
11.3%
r 6
 
8.5%
(space) 6
 
8.5%
s 5
 
7.0%
m 4
 
5.6%
1 4
 
5.6%
9 3
 
4.2%
7 3
 
4.2%
e 3
 
4.2%
n 3
 
4.2%
Other values (20) 26
36.6%

georeferencedDate
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604625
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row5
ValueCountFrequency (%)
5 1
100.0%

Most occurring characters

ValueCountFrequency (%)
5 1
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 1
100.0%

georeferenceProtocol
Text

Missing 

Distinct65
Distinct (%)< 0.1%
Missing366755
Missing (%)60.7%
Memory size4.6 MiB

Length

Max length72
Median length12
Mean length10.94743369
Min length1

Characters and Unicode

Total characters2604077
Distinct characters61
Distinct categories6
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique22
Unique (%)< 0.1%

Sample

1st rowGoogle Maps
2nd rowGoogle Earth
3rd rowGoogle Earth
4th rowGEOLocate
5th rowGoogle Earth
ValueCountFrequency (%)
google 163378
40.4%
earth 120763
29.8%
geolocate 70753
17.5%
maps 42641
 
10.5%
gps 1516
 
0.4%
coordinates 782
 
0.2%
centroid 781
 
0.2%
geonames 718
 
0.2%
from 711
 
0.2%
country 671
 
0.2%
Other values (106) 2062
 
0.5%

Most occurring characters

ValueCountFrequency (%)
o 402567
15.5%
e 238609
9.2%
a 237477
9.1%
G 236541
9.1%
t 194803
7.5%
E 191420
7.4%
l 169480
 
6.5%
(space) 166905
 
6.4%
g 163810
 
6.3%
r 124366
 
4.8%
Other values (51) 478099
18.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1823782
70.0%
Uppercase Letter 612188
 
23.5%
Space Separator 166905
 
6.4%
Decimal Number 942
 
< 0.1%
Other Punctuation 250
 
< 0.1%
Dash Punctuation 10
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 402567
22.1%
e 238609
13.1%
a 237477
13.0%
t 194803
10.7%
l 169480
9.3%
g 163810
9.0%
r 124366
 
6.8%
h 120864
 
6.6%
c 72653
 
4.0%
s 44346
 
2.4%
Other values (14) 54807
 
3.0%
Uppercase Letter
ValueCountFrequency (%)
G 236541
38.6%
E 191420
31.3%
O 70680
 
11.5%
L 65526
 
10.7%
M 42654
 
7.0%
S 1607
 
0.3%
P 1564
 
0.3%
C 982
 
0.2%
N 744
 
0.1%
B 158
 
< 0.1%
Other values (8) 312
 
0.1%
Decimal Number
ValueCountFrequency (%)
9 213
22.6%
1 200
21.2%
7 175
18.6%
2 170
18.0%
0 94
10.0%
6 48
 
5.1%
8 17
 
1.8%
4 14
 
1.5%
3 9
 
1.0%
5 2
 
0.2%
Other Punctuation
ValueCountFrequency (%)
, 85
34.0%
& 49
19.6%
/ 48
19.2%
. 43
17.2%
: 21
 
8.4%
" 2
 
0.8%
; 2
 
0.8%
Space Separator
ValueCountFrequency (%)
(space) 166905
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2435970
93.5%
Common 168107
 
6.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 402567
16.5%
e 238609
9.8%
a 237477
9.7%
G 236541
9.7%
t 194803
8.0%
E 191420
7.9%
l 169480
7.0%
g 163810
6.7%
r 124366
 
5.1%
h 120864
 
5.0%
Other values (32) 356033
14.6%
Common
ValueCountFrequency (%)
(space) 166905
99.3%
9 213
 
0.1%
1 200
 
0.1%
7 175
 
0.1%
2 170
 
0.1%
0 94
 
0.1%
, 85
 
0.1%
& 49
 
< 0.1%
/ 48
 
< 0.1%
6 48
 
< 0.1%
Other values (9) 120
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2604077
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 402567
15.5%
e 238609
9.2%
a 237477
9.1%
G 236541
9.1%
t 194803
7.5%
E 191420
7.4%
l 169480
 
6.5%
(space) 166905
 
6.4%
g 163810
 
6.3%
r 124366
 
4.8%
Other values (51) 478099
18.4%

georeferenceSources
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length8
Median length7.5
Mean length7.5
Min length7

Characters and Unicode

Total characters15
Distinct characters13
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st row9 March
2nd row8.v.1973
ValueCountFrequency (%)
9 1
33.3%
march 1
33.3%
8.v.1973 1
33.3%

Most occurring characters

ValueCountFrequency (%)
9 2
13.3%
. 2
13.3%
(space) 1
 
6.7%
M 1
 
6.7%
a 1
 
6.7%
r 1
 
6.7%
c 1
 
6.7%
h 1
 
6.7%
8 1
 
6.7%
v 1
 
6.7%
Other values (3) 3
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
40.0%
Lowercase Letter 5
33.3%
Other Punctuation 2
 
13.3%
Space Separator 1
 
6.7%
Uppercase Letter 1
 
6.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 2
33.3%
8 1
16.7%
1 1
16.7%
7 1
16.7%
3 1
16.7%
Lowercase Letter
ValueCountFrequency (%)
a 1
20.0%
r 1
20.0%
c 1
20.0%
h 1
20.0%
v 1
20.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Space Separator
ValueCountFrequency (%)
(space) 1
100.0%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9
60.0%
Latin 6
40.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 2
22.2%
. 2
22.2%
(space) 1
11.1%
8 1
11.1%
1 1
11.1%
7 1
11.1%
3 1
11.1%
Latin
ValueCountFrequency (%)
M 1
16.7%
a 1
16.7%
r 1
16.7%
c 1
16.7%
h 1
16.7%
v 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 2
13.3%
. 2
13.3%
(space) 1
 
6.7%
M 1
 
6.7%
a 1
 
6.7%
r 1
 
6.7%
c 1
 
6.7%
h 1
 
6.7%
8 1
 
6.7%
v 1
 
6.7%
Other values (3) 3
20.0%

georeferenceRemarks
Text

Missing 

Distinct1134
Distinct (%)13.4%
Missing596178
Missing (%)98.6%
Memory size4.6 MiB

Length

Max length200
Median length182
Mean length45.17341383
Min length10

Characters and Unicode

Total characters381625
Distinct characters69
Distinct categories11
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique400
Unique (%)4.7%

Sample

1st rowCoordinate Uncertainty In Meters: 56182
2nd rowCoordinate Uncertainty In Meters: 49611
3rd rowCoordinate Uncertainty In Meters: 97700
4th rowCoordinate Uncertainty In Meters: 41787
5th rowCoordinate Uncertainty In Meters: 71236
ValueCountFrequency (%)
in 8278
17.4%
coordinate 8139
17.1%
meters 8139
17.1%
uncertainty 8139
17.1%
verbatim 1307
 
2.7%
coordinate-degrees 1307
 
2.7%
minutes 1307
 
2.7%
3792 274
 
0.6%
the 221
 
0.5%
6066 174
 
0.4%
Other values (1273) 10423
21.8%

Most occurring characters

ValueCountFrequency (%)
e 42267
 
11.1%
(space) 39260
 
10.3%
t 37512
 
9.8%
n 36163
 
9.5%
r 29378
 
7.7%
i 21344
 
5.6%
o 20135
 
5.3%
a 19989
 
5.2%
s 11758
 
3.1%
d 9749
 
2.6%
Other values (59) 114070
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 254955
66.8%
Space Separator 39260
 
10.3%
Decimal Number 38767
 
10.2%
Uppercase Letter 37565
 
9.8%
Other Punctuation 9665
 
2.5%
Dash Punctuation 1342
 
0.4%
Open Punctuation 33
 
< 0.1%
Close Punctuation 33
 
< 0.1%
Initial Punctuation 2
 
< 0.1%
Final Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 42267
16.6%
t 37512
14.7%
n 36163
14.2%
r 29378
11.5%
i 21344
8.4%
o 20135
7.9%
a 19989
7.8%
s 11758
 
4.6%
d 9749
 
3.8%
c 8645
 
3.4%
Other values (16) 18015
7.1%
Uppercase Letter
ValueCountFrequency (%)
C 9645
25.7%
M 8186
21.8%
U 8173
21.8%
I 8160
21.7%
D 1329
 
3.5%
V 1307
 
3.5%
T 264
 
0.7%
N 88
 
0.2%
S 85
 
0.2%
G 82
 
0.2%
Other values (10) 246
 
0.7%
Decimal Number
ValueCountFrequency (%)
1 4554
11.7%
6 4450
11.5%
0 4424
11.4%
3 4271
11.0%
2 4116
10.6%
5 3996
10.3%
4 3411
8.8%
7 3300
8.5%
9 3146
8.1%
8 3099
8.0%
Other Punctuation
ValueCountFrequency (%)
: 8139
84.2%
; 1326
 
13.7%
, 101
 
1.0%
. 90
 
0.9%
' 5
 
0.1%
" 4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 39260
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1342
100.0%
Open Punctuation
ValueCountFrequency (%)
( 33
100.0%
Close Punctuation
ValueCountFrequency (%)
) 33
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 292520
76.7%
Common 89105
 
23.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 42267
14.4%
t 37512
12.8%
n 36163
12.4%
r 29378
10.0%
i 21344
 
7.3%
o 20135
 
6.9%
a 19989
 
6.8%
s 11758
 
4.0%
d 9749
 
3.3%
C 9645
 
3.3%
Other values (36) 54580
18.7%
Common
ValueCountFrequency (%)
(space) 39260
44.1%
: 8139
 
9.1%
1 4554
 
5.1%
6 4450
 
5.0%
0 4424
 
5.0%
3 4271
 
4.8%
2 4116
 
4.6%
5 3996
 
4.5%
4 3411
 
3.8%
7 3300
 
3.7%
Other values (13) 9184
 
10.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 381616
> 99.9%
None 5
 
< 0.1%
Punctuation 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 42267
 
11.1%
(space) 39260
 
10.3%
t 37512
 
9.8%
n 36163
 
9.5%
r 29378
 
7.7%
i 21344
 
5.6%
o 20135
 
5.3%
a 19989
 
5.2%
s 11758
 
3.1%
d 9749
 
2.6%
Other values (56) 114061
29.9%
None
ValueCountFrequency (%)
ñ 5
100.0%
Punctuation
ValueCountFrequency (%)
2
50.0%
2
50.0%
Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length70
Median length65.5
Mean length65.5
Min length61

Characters and Unicode

Total characters131
Distinct characters21
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st rowAnimalia, Arthropoda, Insecta, Lepidoptera, Papilionidae, Papilioninae
2nd rowAnimalia, Arthropoda, Insecta, Odonata, Anisoptera, Aeshnidae
ValueCountFrequency (%)
animalia 2
16.7%
arthropoda 2
16.7%
insecta 2
16.7%
lepidoptera 1
8.3%
papilionidae 1
8.3%
papilioninae 1
8.3%
odonata 1
8.3%
anisoptera 1
8.3%
aeshnidae 1
8.3%

Most occurring characters

ValueCountFrequency (%)
a 17
13.0%
i 13
 
9.9%
, 10
 
7.6%
(space) 10
 
7.6%
n 10
 
7.6%
e 9
 
6.9%
o 9
 
6.9%
t 7
 
5.3%
p 7
 
5.3%
A 6
 
4.6%
Other values (11) 33
25.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 99
75.6%
Uppercase Letter 12
 
9.2%
Other Punctuation 10
 
7.6%
Space Separator 10
 
7.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 17
17.2%
i 13
13.1%
n 10
10.1%
e 9
9.1%
o 9
9.1%
t 7
7.1%
p 7
7.1%
r 6
 
6.1%
d 6
 
6.1%
s 4
 
4.0%
Other values (4) 11
11.1%
Uppercase Letter
ValueCountFrequency (%)
A 6
50.0%
I 2
 
16.7%
P 2
 
16.7%
L 1
 
8.3%
O 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
, 10
100.0%
Space Separator
ValueCountFrequency (%)
(space) 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 111
84.7%
Common 20
 
15.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 17
15.3%
i 13
11.7%
n 10
9.0%
e 9
8.1%
o 9
8.1%
t 7
 
6.3%
p 7
 
6.3%
A 6
 
5.4%
r 6
 
5.4%
d 6
 
5.4%
Other values (9) 21
18.9%
Common
ValueCountFrequency (%)
, 10
50.0%
(space) 10
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 17
13.0%
i 13
 
9.9%
, 10
 
7.6%
(space) 10
 
7.6%
n 10
 
7.6%
e 9
 
6.9%
o 9
 
6.9%
t 7
 
5.3%
p 7
 
5.3%
A 6
 
4.6%
Other values (11) 33
25.2%

earliestEraOrLowestErathem
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters16
Distinct characters6
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAnimalia
2nd rowAnimalia
ValueCountFrequency (%)
animalia 2
100.0%

Most occurring characters

ValueCountFrequency (%)
i 4
25.0%
a 4
25.0%
A 2
12.5%
n 2
12.5%
m 2
12.5%
l 2
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14
87.5%
Uppercase Letter 2
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4
28.6%
a 4
28.6%
n 2
14.3%
m 2
14.3%
l 2
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4
25.0%
a 4
25.0%
A 2
12.5%
n 2
12.5%
m 2
12.5%
l 2
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 4
25.0%
a 4
25.0%
A 2
12.5%
n 2
12.5%
m 2
12.5%
l 2
12.5%

latestEraOrHighestErathem
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters20
Distinct characters8
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowArthropoda
2nd rowArthropoda
ValueCountFrequency (%)
arthropoda 2
100.0%

Most occurring characters

ValueCountFrequency (%)
r 4
20.0%
o 4
20.0%
A 2
10.0%
t 2
10.0%
h 2
10.0%
p 2
10.0%
d 2
10.0%
a 2
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18
90.0%
Uppercase Letter 2
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 4
22.2%
o 4
22.2%
t 2
11.1%
h 2
11.1%
p 2
11.1%
d 2
11.1%
a 2
11.1%
Uppercase Letter
ValueCountFrequency (%)
A 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 4
20.0%
o 4
20.0%
A 2
10.0%
t 2
10.0%
h 2
10.0%
p 2
10.0%
d 2
10.0%
a 2
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 4
20.0%
o 4
20.0%
A 2
10.0%
t 2
10.0%
h 2
10.0%
p 2
10.0%
d 2
10.0%
a 2
10.0%

earliestPeriodOrLowestSystem
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters14
Distinct characters7
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowInsecta
2nd rowInsecta
ValueCountFrequency (%)
insecta 2
100.0%

Most occurring characters

ValueCountFrequency (%)
I 2
14.3%
n 2
14.3%
s 2
14.3%
e 2
14.3%
c 2
14.3%
t 2
14.3%
a 2
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
85.7%
Uppercase Letter 2
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 2
16.7%
s 2
16.7%
e 2
16.7%
c 2
16.7%
t 2
16.7%
a 2
16.7%
Uppercase Letter
ValueCountFrequency (%)
I 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 2
14.3%
n 2
14.3%
s 2
14.3%
e 2
14.3%
c 2
14.3%
t 2
14.3%
a 2
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 2
14.3%
n 2
14.3%
s 2
14.3%
e 2
14.3%
c 2
14.3%
t 2
14.3%
a 2
14.3%
Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length11
Median length9
Mean length9
Min length7

Characters and Unicode

Total characters18
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st rowLepidoptera
2nd rowOdonata
ValueCountFrequency (%)
lepidoptera 1
50.0%
odonata 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 3
16.7%
e 2
11.1%
p 2
11.1%
d 2
11.1%
o 2
11.1%
t 2
11.1%
L 1
 
5.6%
i 1
 
5.6%
r 1
 
5.6%
O 1
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
88.9%
Uppercase Letter 2
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
18.8%
e 2
12.5%
p 2
12.5%
d 2
12.5%
o 2
12.5%
t 2
12.5%
i 1
 
6.2%
r 1
 
6.2%
n 1
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
L 1
50.0%
O 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
16.7%
e 2
11.1%
p 2
11.1%
d 2
11.1%
o 2
11.1%
t 2
11.1%
L 1
 
5.6%
i 1
 
5.6%
r 1
 
5.6%
O 1
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
16.7%
e 2
11.1%
p 2
11.1%
d 2
11.1%
o 2
11.1%
t 2
11.1%
L 1
 
5.6%
i 1
 
5.6%
r 1
 
5.6%
O 1
 
5.6%
Distinct4
Distinct (%)100.0%
Missing604622
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length32
Median length10.5
Mean length14.25
Min length4

Characters and Unicode

Total characters57
Distinct characters20
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4
Unique (%)100.0%

Sample

1st rowPapilionidae
2nd rowUnited States, Florida, Pinellas
3rd rowAeshnidae
4th rowPeru
ValueCountFrequency (%)
papilionidae 1
14.3%
united 1
14.3%
states 1
14.3%
florida 1
14.3%
pinellas 1
14.3%
aeshnidae 1
14.3%
peru 1
14.3%

Most occurring characters

ValueCountFrequency (%)
i 7
12.3%
e 7
12.3%
a 6
10.5%
l 4
 
7.0%
n 4
 
7.0%
d 4
 
7.0%
P 3
 
5.3%
s 3
 
5.3%
(space) 3
 
5.3%
t 3
 
5.3%
Other values (10) 13
22.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 45
78.9%
Uppercase Letter 7
 
12.3%
Space Separator 3
 
5.3%
Other Punctuation 2
 
3.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 7
15.6%
e 7
15.6%
a 6
13.3%
l 4
8.9%
n 4
8.9%
d 4
8.9%
s 3
6.7%
t 3
6.7%
o 2
 
4.4%
r 2
 
4.4%
Other values (3) 3
6.7%
Uppercase Letter
ValueCountFrequency (%)
P 3
42.9%
U 1
 
14.3%
S 1
 
14.3%
F 1
 
14.3%
A 1
 
14.3%
Space Separator
ValueCountFrequency (%)
(space) 3
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 52
91.2%
Common 5
 
8.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 7
13.5%
e 7
13.5%
a 6
11.5%
l 4
7.7%
n 4
7.7%
d 4
7.7%
P 3
 
5.8%
s 3
 
5.8%
t 3
 
5.8%
o 2
 
3.8%
Other values (8) 9
17.3%
Common
ValueCountFrequency (%)
(space) 3
60.0%
, 2
40.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 57
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 7
12.3%
e 7
12.3%
a 6
10.5%
l 4
 
7.0%
n 4
 
7.0%
d 4
 
7.0%
P 3
 
5.3%
s 3
 
5.3%
3
 
5.3%
t 3
 
5.3%
Other values (10) 13
22.8%
Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters26
Distinct characters13
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowNORTH_AMERICA
2nd rowSOUTH_AMERICA
ValueCountFrequency (%)
north_america 1
50.0%
south_america 1
50.0%

Most occurring characters

ValueCountFrequency (%)
A 4
15.4%
R 3
11.5%
O 2
7.7%
T 2
7.7%
H 2
7.7%
_ 2
7.7%
M 2
7.7%
E 2
7.7%
I 2
7.7%
C 2
7.7%
Other values (3) 3
11.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 24
92.3%
Connector Punctuation 2
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 4
16.7%
R 3
12.5%
O 2
8.3%
T 2
8.3%
H 2
8.3%
M 2
8.3%
E 2
8.3%
I 2
8.3%
C 2
8.3%
N 1
 
4.2%
Other values (2) 2
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 24
92.3%
Common 2
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 4
16.7%
R 3
12.5%
O 2
8.3%
T 2
8.3%
H 2
8.3%
M 2
8.3%
E 2
8.3%
I 2
8.3%
C 2
8.3%
N 1
 
4.2%
Other values (2) 2
8.3%
Common
ValueCountFrequency (%)
_ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 4
15.4%
R 3
11.5%
O 2
7.7%
T 2
7.7%
H 2
7.7%
_ 2
7.7%
M 2
7.7%
E 2
7.7%
I 2
7.7%
C 2
7.7%
Other values (3) 3
11.5%
Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length10
Median length8.5
Mean length8.5
Min length7

Characters and Unicode

Total characters17
Distinct characters14
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowTroides
2nd rowGynacantha
ValueCountFrequency (%)
troides 1
50.0%
gynacantha 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 3
17.6%
n 2
11.8%
T 1
 
5.9%
r 1
 
5.9%
o 1
 
5.9%
i 1
 
5.9%
d 1
 
5.9%
e 1
 
5.9%
s 1
 
5.9%
G 1
 
5.9%
Other values (4) 4
23.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15
88.2%
Uppercase Letter 2
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
20.0%
n 2
13.3%
r 1
 
6.7%
o 1
 
6.7%
i 1
 
6.7%
d 1
 
6.7%
e 1
 
6.7%
s 1
 
6.7%
y 1
 
6.7%
c 1
 
6.7%
Other values (2) 2
13.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
G 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
17.6%
n 2
11.8%
T 1
 
5.9%
r 1
 
5.9%
o 1
 
5.9%
i 1
 
5.9%
d 1
 
5.9%
e 1
 
5.9%
s 1
 
5.9%
G 1
 
5.9%
Other values (4) 4
23.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
17.6%
n 2
11.8%
T 1
 
5.9%
r 1
 
5.9%
o 1
 
5.9%
i 1
 
5.9%
d 1
 
5.9%
e 1
 
5.9%
s 1
 
5.9%
G 1
 
5.9%
Other values (4) 4
23.5%
Distinct4
Distinct (%)100.0%
Missing604622
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length10
Median length8.5
Mean length5.25
Min length2

Characters and Unicode

Total characters21
Distinct characters18
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)100.0%

Sample

1st rowTroides
2nd rowUS
3rd rowGynacantha
4th rowPE
ValueCountFrequency (%)
troides 1
25.0%
us 1
25.0%
gynacantha 1
25.0%
pe 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 3
 
14.3%
n 2
 
9.5%
T 1
 
4.8%
r 1
 
4.8%
P 1
 
4.8%
h 1
 
4.8%
t 1
 
4.8%
c 1
 
4.8%
y 1
 
4.8%
G 1
 
4.8%
Other values (8) 8
38.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15
71.4%
Uppercase Letter 6
 
28.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
20.0%
n 2
13.3%
r 1
 
6.7%
h 1
 
6.7%
t 1
 
6.7%
c 1
 
6.7%
y 1
 
6.7%
s 1
 
6.7%
e 1
 
6.7%
d 1
 
6.7%
Other values (2) 2
13.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
16.7%
P 1
16.7%
G 1
16.7%
S 1
16.7%
U 1
16.7%
E 1
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 21
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
 
14.3%
n 2
 
9.5%
T 1
 
4.8%
r 1
 
4.8%
P 1
 
4.8%
h 1
 
4.8%
t 1
 
4.8%
c 1
 
4.8%
y 1
 
4.8%
G 1
 
4.8%
Other values (8) 8
38.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
 
14.3%
n 2
 
9.5%
T 1
 
4.8%
r 1
 
4.8%
P 1
 
4.8%
h 1
 
4.8%
t 1
 
4.8%
c 1
 
4.8%
y 1
 
4.8%
G 1
 
4.8%
Other values (8) 8
38.1%

group
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604625
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters7
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowFlorida
ValueCountFrequency (%)
florida 1
100.0%

Most occurring characters

ValueCountFrequency (%)
F 1
14.3%
l 1
14.3%
o 1
14.3%
r 1
14.3%
i 1
14.3%
d 1
14.3%
a 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 1
16.7%
o 1
16.7%
r 1
16.7%
i 1
16.7%
d 1
16.7%
a 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
F 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 1
14.3%
l 1
14.3%
o 1
14.3%
r 1
14.3%
i 1
14.3%
d 1
14.3%
a 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 1
14.3%
l 1
14.3%
o 1
14.3%
r 1
14.3%
i 1
14.3%
d 1
14.3%
a 1
14.3%

formation
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604625
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters8
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowPinellas
ValueCountFrequency (%)
pinellas 1
100.0%

Most occurring characters

ValueCountFrequency (%)
l 2
25.0%
P 1
12.5%
i 1
12.5%
n 1
12.5%
e 1
12.5%
a 1
12.5%
s 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2
28.6%
i 1
14.3%
n 1
14.3%
e 1
14.3%
a 1
14.3%
s 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2
25.0%
P 1
12.5%
i 1
12.5%
n 1
12.5%
e 1
12.5%
a 1
12.5%
s 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 2
25.0%
P 1
12.5%
i 1
12.5%
n 1
12.5%
e 1
12.5%
a 1
12.5%
s 1
12.5%

member
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length11
Median length10
Mean length10
Min length9

Characters and Unicode

Total characters20
Distinct characters13
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowamphrysus
2nd rowmembranalis
ValueCountFrequency (%)
amphrysus 1
50.0%
membranalis 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 3
15.0%
m 3
15.0%
s 3
15.0%
r 2
10.0%
p 1
 
5.0%
h 1
 
5.0%
y 1
 
5.0%
u 1
 
5.0%
e 1
 
5.0%
b 1
 
5.0%
Other values (3) 3
15.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
15.0%
m 3
15.0%
s 3
15.0%
r 2
10.0%
p 1
 
5.0%
h 1
 
5.0%
y 1
 
5.0%
u 1
 
5.0%
e 1
 
5.0%
b 1
 
5.0%
Other values (3) 3
15.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
15.0%
m 3
15.0%
s 3
15.0%
r 2
10.0%
p 1
 
5.0%
h 1
 
5.0%
y 1
 
5.0%
u 1
 
5.0%
e 1
 
5.0%
b 1
 
5.0%
Other values (3) 3
15.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
15.0%
m 3
15.0%
s 3
15.0%
r 2
10.0%
p 1
 
5.0%
h 1
 
5.0%
y 1
 
5.0%
u 1
 
5.0%
e 1
 
5.0%
b 1
 
5.0%
Other values (3) 3
15.0%

bed
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length31
Median length22.5
Mean length22.5
Min length14

Characters and Unicode

Total characters45
Distinct characters25
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowSt. Petersburg
2nd rowHuaru Valley, 90 mi. N. of Lima
ValueCountFrequency (%)
st 1
11.1%
petersburg 1
11.1%
huaru 1
11.1%
valley 1
11.1%
90 1
11.1%
mi 1
11.1%
n 1
11.1%
of 1
11.1%
lima 1
11.1%

Most occurring characters

ValueCountFrequency (%)
7
15.6%
a 3
 
6.7%
. 3
 
6.7%
e 3
 
6.7%
r 3
 
6.7%
u 3
 
6.7%
i 2
 
4.4%
m 2
 
4.4%
t 2
 
4.4%
l 2
 
4.4%
Other values (15) 15
33.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26
57.8%
Space Separator 7
 
15.6%
Uppercase Letter 6
 
13.3%
Other Punctuation 4
 
8.9%
Decimal Number 2
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
11.5%
e 3
11.5%
r 3
11.5%
u 3
11.5%
i 2
7.7%
m 2
7.7%
t 2
7.7%
l 2
7.7%
f 1
 
3.8%
o 1
 
3.8%
Other values (4) 4
15.4%
Uppercase Letter
ValueCountFrequency (%)
N 1
16.7%
S 1
16.7%
V 1
16.7%
H 1
16.7%
P 1
16.7%
L 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 3
75.0%
, 1
 
25.0%
Decimal Number
ValueCountFrequency (%)
0 1
50.0%
9 1
50.0%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32
71.1%
Common 13
28.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
 
9.4%
e 3
 
9.4%
r 3
 
9.4%
u 3
 
9.4%
i 2
 
6.2%
m 2
 
6.2%
t 2
 
6.2%
l 2
 
6.2%
f 1
 
3.1%
o 1
 
3.1%
Other values (10) 10
31.2%
Common
ValueCountFrequency (%)
7
53.8%
. 3
23.1%
, 1
 
7.7%
0 1
 
7.7%
9 1
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7
15.6%
a 3
 
6.7%
. 3
 
6.7%
e 3
 
6.7%
r 3
 
6.7%
u 3
 
6.7%
i 2
 
4.4%
m 2
 
4.4%
t 2
 
4.4%
l 2
 
4.4%
Other values (15) 15
33.3%

verbatimIdentification
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters14
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSPECIES
2nd rowSPECIES
ValueCountFrequency (%)
species 2
100.0%

Most occurring characters

ValueCountFrequency (%)
S 4
28.6%
E 4
28.6%
P 2
14.3%
C 2
14.3%
I 2
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 14
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 4
28.6%
E 4
28.6%
P 2
14.3%
C 2
14.3%
I 2
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 14
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 4
28.6%
E 4
28.6%
P 2
14.3%
C 2
14.3%
I 2
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 4
28.6%
E 4
28.6%
P 2
14.3%
C 2
14.3%
I 2
14.3%
Distinct15
Distinct (%)1.0%
Missing603189
Missing (%)99.8%
Memory size4.6 MiB

Length

Max length13
Median length9
Mean length5.812108559
Min length2

Characters and Unicode

Total characters8352
Distinct characters22
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)0.1%

Sample

1st rownear
2nd rowuncertain
3rd rownear
4th rownear
5th rowcf.
Value             Count  Frequency (%)
near              466    31.6%
uncertain         459    31.2%
cf                238    16.2%
group             113    7.7%
subgroup          80     5.4%
complex           26     1.8%
aff               21     1.4%
sp                21     1.4%
n                 15     1.0%
sensu             11     0.7%
Other values (5)  23     1.6%

Most occurring characters

ValueCountFrequency (%)
n 1418
17.0%
r 1131
13.5%
e 962
11.5%
a 947
11.3%
u 743
8.9%
c 732
8.8%
t 481
 
5.8%
i 470
 
5.6%
f 280
 
3.4%
p 240
 
2.9%
Other values (12) 948
11.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8132
97.4%
Other Punctuation 180
 
2.2%
Space Separator 36
 
0.4%
Uppercase Letter 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1418
17.4%
r 1131
13.9%
e 962
11.8%
a 947
11.6%
u 743
9.1%
c 732
9.0%
t 481
 
5.9%
i 470
 
5.8%
f 280
 
3.4%
p 240
 
3.0%
Other values (8) 728
9.0%
Uppercase Letter
ValueCountFrequency (%)
C 2
50.0%
B 2
50.0%
Other Punctuation
ValueCountFrequency (%)
. 180
100.0%
Space Separator
ValueCountFrequency (%)
36
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8136
97.4%
Common 216
 
2.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 1418
17.4%
r 1131
13.9%
e 962
11.8%
a 947
11.6%
u 743
9.1%
c 732
9.0%
t 481
 
5.9%
i 470
 
5.8%
f 280
 
3.4%
p 240
 
2.9%
Other values (10) 732
9.0%
Common
ValueCountFrequency (%)
. 180
83.3%
36
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8352
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 1418
17.0%
r 1131
13.5%
e 962
11.5%
a 947
11.3%
u 743
8.9%
c 732
8.8%
t 481
 
5.8%
i 470
 
5.6%
f 280
 
3.4%
p 240
 
2.9%
Other values (12) 948
11.4%

typeStatus
Text

Missing 

Distinct: 11
Distinct (%): < 0.1%
Missing: 486591
Missing (%): 80.5%
Memory size: 4.6 MiB
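
The percentages in this summary follow from the row counts. As a worked check, the sketch below recomputes them from the "Number of observations" figure in the report overview and the Missing and Distinct counts for typeStatus; the variable names are illustrative, and the assumption that Distinct (%) is taken relative to non-missing rows is inferred from the figures rather than stated in the report.

```python
# Figures copied from the report.
n_rows = 604_626     # "Number of observations" in the overview
n_missing = 486_591  # "Missing" for typeStatus
n_distinct = 11      # "Distinct" for typeStatus

# Rows that actually carry a typeStatus value.
n_present = n_rows - n_missing

# Missing share is relative to all rows.
missing_pct = 100 * n_missing / n_rows

# Assumption: the distinct share is relative to the non-missing rows only.
distinct_pct = 100 * n_distinct / n_present

print(n_present, f"{missing_pct:.1f}%", f"{distinct_pct:.4f}%")
```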

Length

Max length13
Median length8
Mean length6.818274241
Min length4

Characters and Unicode

Total characters804795
Distinct characters12
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row: PARATYPE
2nd row: TYPE
3rd row: HOLOTYPE
4th row: TYPE
5th row: SYNTYPE
Value          Count  Frequency (%)
holotype       53956  45.7%
type           32775  27.8%
syntype        13266  11.2%
paratype       11028  9.3%
lectotype      5190   4.4%
allotype       1078   0.9%
neotype        315    0.3%
cotype         303    0.3%
paralectotype  120    0.1%
paraneotype    3      < 0.1%

Most occurring characters

ValueCountFrequency (%)
Y 131301
16.3%
P 129186
16.1%
E 123664
15.4%
T 123345
15.3%
O 114923
14.3%
L 61424
7.6%
H 53956
6.7%
A 23381
 
2.9%
N 13585
 
1.7%
S 13266
 
1.6%
Other values (2) 16764
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 804795
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 131301
16.3%
P 129186
16.1%
E 123664
15.4%
T 123345
15.3%
O 114923
14.3%
L 61424
7.6%
H 53956
6.7%
A 23381
 
2.9%
N 13585
 
1.7%
S 13266
 
1.6%
Other values (2) 16764
 
2.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 804795
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 131301
16.3%
P 129186
16.1%
E 123664
15.4%
T 123345
15.3%
O 114923
14.3%
L 61424
7.6%
H 53956
6.7%
A 23381
 
2.9%
N 13585
 
1.7%
S 13266
 
1.6%
Other values (2) 16764
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 804795
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 131301
16.3%
P 129186
16.1%
E 123664
15.4%
T 123345
15.3%
O 114923
14.3%
L 61424
7.6%
H 53956
6.7%
A 23381
 
2.9%
N 13585
 
1.7%
S 13266
 
1.6%
Other values (2) 16764
 
2.1%

identifiedBy
Text

Missing 

Distinct2736
Distinct (%)1.8%
Missing454955
Missing (%)75.2%
Memory size4.6 MiB

Length

Max length150
Median length106
Mean length27.79390129
Min length2

Characters and Unicode

Total characters4159941
Distinct characters71
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique933 ?
Unique (%)0.6%

Sample

1st rowWestfall, M. J., Jr.
2nd rowDonnelly, Thomas W.
3rd rowFlint, Oliver S., Jr., (ENT), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
4th rowKormann, K.
5th rowDeMarmels
ValueCountFrequency (%)
w 28127
 
4.4%
united 24410
 
3.8%
states 24409
 
3.8%
22736
 
3.5%
of 21999
 
3.4%
s 21914
 
3.4%
smithsonian 21909
 
3.4%
institution 21909
 
3.4%
museum 21366
 
3.3%
natural 21088
 
3.3%
Other values (2399) 413039
64.2%

Most occurring characters

ValueCountFrequency (%)
493235
 
11.9%
i 250985
 
6.0%
o 231935
 
5.6%
t 230915
 
5.6%
n 230479
 
5.5%
a 200365
 
4.8%
, 193541
 
4.7%
r 182836
 
4.4%
. 170349
 
4.1%
s 166922
 
4.0%
Other values (61) 1808379
43.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2295249
55.2%
Uppercase Letter 890430
 
21.4%
Space Separator 493235
 
11.9%
Other Punctuation 364740
 
8.8%
Close Punctuation 46598
 
1.1%
Open Punctuation 46598
 
1.1%
Dash Punctuation 23091
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 250985
10.9%
o 231935
10.1%
t 230915
10.1%
n 230479
10.0%
a 200365
8.7%
r 182836
8.0%
s 166922
7.3%
l 162341
7.1%
e 157389
6.9%
u 114794
 
5.0%
Other values (23) 366288
16.0%
Uppercase Letter
ValueCountFrequency (%)
T 112733
12.7%
S 105388
11.8%
N 90465
10.2%
E 79705
 
9.0%
M 58701
 
6.6%
D 53037
 
6.0%
I 47394
 
5.3%
A 45385
 
5.1%
W 36642
 
4.1%
J 36226
 
4.1%
Other values (16) 224754
25.2%
Other Punctuation
ValueCountFrequency (%)
, 193541
53.1%
. 170349
46.7%
& 690
 
0.2%
' 157
 
< 0.1%
; 2
 
< 0.1%
? 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 46596
> 99.9%
] 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 46596
> 99.9%
[ 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
493235
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 23091
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3185679
76.6%
Common 974262
 
23.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 250985
 
7.9%
o 231935
 
7.3%
t 230915
 
7.2%
n 230479
 
7.2%
a 200365
 
6.3%
r 182836
 
5.7%
s 166922
 
5.2%
l 162341
 
5.1%
e 157389
 
4.9%
u 114794
 
3.6%
Other values (49) 1256718
39.4%
Common
ValueCountFrequency (%)
493235
50.6%
, 193541
 
19.9%
. 170349
 
17.5%
) 46596
 
4.8%
( 46596
 
4.8%
- 23091
 
2.4%
& 690
 
0.1%
' 157
 
< 0.1%
[ 2
 
< 0.1%
] 2
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4159904
> 99.9%
None 37
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
493235
 
11.9%
i 250985
 
6.0%
o 231935
 
5.6%
t 230915
 
5.6%
n 230479
 
5.5%
a 200365
 
4.8%
, 193541
 
4.7%
r 182836
 
4.4%
. 170349
 
4.1%
s 166922
 
4.0%
Other values (54) 1808342
43.5%
None
Value  Count  Frequency (%)
á      9      24.3%
ń      9      24.3%
ż      9      24.3%
ö      7      18.9%
ü      1      2.7%
è      1      2.7%
ä      1      2.7%
identifiedByID
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters16
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowACCEPTED
2nd rowACCEPTED
ValueCountFrequency (%)
accepted 2
100.0%

Most occurring characters

ValueCountFrequency (%)
C 4
25.0%
E 4
25.0%
A 2
12.5%
P 2
12.5%
T 2
12.5%
D 2
12.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 16
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 4
25.0%
E 4
25.0%
A 2
12.5%
P 2
12.5%
T 2
12.5%
D 2
12.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 4
25.0%
E 4
25.0%
A 2
12.5%
P 2
12.5%
T 2
12.5%
D 2
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 4
25.0%
E 4
25.0%
A 2
12.5%
P 2
12.5%
T 2
12.5%
D 2
12.5%
Distinct3
Distinct (%)75.0%
Missing604622
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length36
Median length22
Mean length21.75
Min length7

Characters and Unicode

Total characters87
Distinct characters17
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)50.0%

Sample

1st row821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row27.7731
3rd row821cc27a-e3bb-4bc5-ac34-89ada245069d
4th row-4.55006
ValueCountFrequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d 2
50.0%
27.7731 1
25.0%
4.55006 1
25.0%

Most occurring characters

ValueCountFrequency (%)
- 9
10.3%
c 8
 
9.2%
a 8
 
9.2%
2 7
 
8.0%
4 7
 
8.0%
b 6
 
6.9%
5 6
 
6.9%
3 5
 
5.7%
7 5
 
5.7%
9 4
 
4.6%
Other values (7) 22
25.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 48
55.2%
Lowercase Letter 28
32.2%
Dash Punctuation 9
 
10.3%
Other Punctuation 2
 
2.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 7
14.6%
4 7
14.6%
5 6
12.5%
3 5
10.4%
7 5
10.4%
9 4
8.3%
0 4
8.3%
8 4
8.3%
1 3
6.2%
6 3
6.2%
Lowercase Letter
ValueCountFrequency (%)
c 8
28.6%
a 8
28.6%
b 6
21.4%
d 4
14.3%
e 2
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 59
67.8%
Latin 28
32.2%

Most frequent character per script

Common
ValueCountFrequency (%)
- 9
15.3%
2 7
11.9%
4 7
11.9%
5 6
10.2%
3 5
8.5%
7 5
8.5%
9 4
6.8%
0 4
6.8%
8 4
6.8%
1 3
 
5.1%
Other values (2) 5
8.5%
Latin
ValueCountFrequency (%)
c 8
28.6%
a 8
28.6%
b 6
21.4%
d 4
14.3%
e 2
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 87
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 9
10.3%
c 8
 
9.2%
a 8
 
9.2%
2 7
 
8.0%
4 7
 
8.0%
b 6
 
6.9%
5 6
 
6.9%
3 5
 
5.7%
7 5
 
5.7%
9 4
 
4.6%
Other values (7) 22
25.3%

identificationRemarks
Text

Missing 

Distinct3
Distinct (%)75.0%
Missing604622
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length8
Median length7
Mean length4.5
Min length2

Characters and Unicode

Total characters18
Distinct characters10
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)50.0%

Sample

1st rowUS
2nd row-82.64
3rd rowUS
4th row-76.1874
ValueCountFrequency (%)
us 2
50.0%
82.64 1
25.0%
76.1874 1
25.0%

Most occurring characters

ValueCountFrequency (%)
U 2
11.1%
S 2
11.1%
- 2
11.1%
8 2
11.1%
. 2
11.1%
6 2
11.1%
4 2
11.1%
7 2
11.1%
2 1
5.6%
1 1
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10
55.6%
Uppercase Letter 4
 
22.2%
Dash Punctuation 2
 
11.1%
Other Punctuation 2
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 2
20.0%
6 2
20.0%
4 2
20.0%
7 2
20.0%
2 1
10.0%
1 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
U 2
50.0%
S 2
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14
77.8%
Latin 4
 
22.2%

Most frequent character per script

Common
ValueCountFrequency (%)
- 2
14.3%
8 2
14.3%
. 2
14.3%
6 2
14.3%
4 2
14.3%
7 2
14.3%
2 1
7.1%
1 1
7.1%
Latin
ValueCountFrequency (%)
U 2
50.0%
S 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 2
11.1%
S 2
11.1%
- 2
11.1%
8 2
11.1%
. 2
11.1%
6 2
11.1%
4 2
11.1%
7 2
11.1%
2 1
5.6%
1 1
5.6%

taxonID
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters48
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st row2024-12-02T13:59:16.382Z
2nd row2024-12-02T13:59:48.546Z
ValueCountFrequency (%)
2024-12-02t13:59:16.382z 1
50.0%
2024-12-02t13:59:48.546z 1
50.0%

Most occurring characters

ValueCountFrequency (%)
2 9
18.8%
1 5
10.4%
0 4
8.3%
4 4
8.3%
- 4
8.3%
: 4
8.3%
3 3
 
6.2%
5 3
 
6.2%
T 2
 
4.2%
9 2
 
4.2%
Other values (4) 8
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 34
70.8%
Other Punctuation 6
 
12.5%
Dash Punctuation 4
 
8.3%
Uppercase Letter 4
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 9
26.5%
1 5
14.7%
0 4
11.8%
4 4
11.8%
3 3
 
8.8%
5 3
 
8.8%
9 2
 
5.9%
6 2
 
5.9%
8 2
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 4
66.7%
. 2
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 2
50.0%
Z 2
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 44
91.7%
Latin 4
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 9
20.5%
1 5
11.4%
0 4
9.1%
4 4
9.1%
- 4
9.1%
: 4
9.1%
3 3
 
6.8%
5 3
 
6.8%
9 2
 
4.5%
6 2
 
4.5%
Other values (2) 4
9.1%
Latin
ValueCountFrequency (%)
T 2
50.0%
Z 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 9
18.8%
1 5
10.4%
0 4
8.3%
4 4
8.3%
- 4
8.3%
: 4
8.3%
3 3
 
6.2%
5 3
 
6.2%
T 2
 
4.2%
9 2
 
4.2%
Other values (4) 8
16.7%
Distinct188378
Distinct (%)31.4%
Missing4648
Missing (%)0.8%
Memory size4.6 MiB

Length

Max length8
Median length7
Mean length6.955165023
Min length1

Characters and Unicode

Total characters4172946
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique134600 ?
Unique (%)22.4%

Sample

1st row7866975
2nd row5122189
3rd row1939887
4th row1422444
5th row4988370
ValueCountFrequency (%)
1340278 10672
 
1.8%
1340525 6265
 
1.0%
1340393 4073
 
0.7%
10409744 3623
 
0.6%
789 3466
 
0.6%
1340467 3343
 
0.6%
9164 3176
 
0.5%
1340350 3129
 
0.5%
1341979 2431
 
0.4%
1340485 2119
 
0.4%
Other values (188368) 557681
93.0%

Most occurring characters

ValueCountFrequency (%)
1 709890
17.0%
4 525521
12.6%
0 431132
10.3%
2 418685
10.0%
3 411620
9.9%
5 382598
9.2%
8 332590
8.0%
7 330905
7.9%
9 329832
7.9%
6 300173
7.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4172946
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 709890
17.0%
4 525521
12.6%
0 431132
10.3%
2 418685
10.0%
3 411620
9.9%
5 382598
9.2%
8 332590
8.0%
7 330905
7.9%
9 329832
7.9%
6 300173
7.2%

Most occurring scripts

ValueCountFrequency (%)
Common 4172946
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 709890
17.0%
4 525521
12.6%
0 431132
10.3%
2 418685
10.0%
3 411620
9.9%
5 382598
9.2%
8 332590
8.0%
7 330905
7.9%
9 329832
7.9%
6 300173
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4172946
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 709890
17.0%
4 525521
12.6%
0 431132
10.3%
2 418685
10.0%
3 411620
9.9%
5 382598
9.2%
8 332590
8.0%
7 330905
7.9%
9 329832
7.9%
6 300173
7.2%

namePublishedInID
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length112
Median length80
Mean length80
Min length48

Characters and Unicode

Total characters160
Distinct characters21
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
2nd rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES
ValueCountFrequency (%)
occurrence_status_inferred_from_individual_count 1
50.0%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates 1
50.0%

Most occurring characters

ValueCountFrequency (%)
_ 16
10.0%
E 15
9.4%
R 13
 
8.1%
I 12
 
7.5%
D 12
 
7.5%
N 12
 
7.5%
T 11
 
6.9%
C 11
 
6.9%
O 11
 
6.9%
U 10
 
6.2%
Other values (11) 37
23.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 140
87.5%
Connector Punctuation 16
 
10.0%
Other Punctuation 2
 
1.2%
Decimal Number 2
 
1.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 15
10.7%
R 13
9.3%
I 12
8.6%
D 12
8.6%
N 12
8.6%
T 11
7.9%
C 11
7.9%
O 11
7.9%
U 10
 
7.1%
S 8
 
5.7%
Other values (7) 25
17.9%
Decimal Number
ValueCountFrequency (%)
8 1
50.0%
4 1
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 16
100.0%
Other Punctuation
ValueCountFrequency (%)
; 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 140
87.5%
Common 20
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 15
10.7%
R 13
9.3%
I 12
8.6%
D 12
8.6%
N 12
8.6%
T 11
7.9%
C 11
7.9%
O 11
7.9%
U 10
 
7.1%
S 8
 
5.7%
Other values (7) 25
17.9%
Common
ValueCountFrequency (%)
_ 16
80.0%
; 2
 
10.0%
8 1
 
5.0%
4 1
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 160
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 16
10.0%
E 15
9.4%
R 13
 
8.1%
I 12
 
7.5%
D 12
 
7.5%
N 12
 
7.5%
T 11
 
6.9%
C 11
 
6.9%
O 11
 
6.9%
U 10
 
6.2%
Other values (11) 37
23.1%
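
The category breakdown reported above for namePublishedInID rests on Unicode character properties. The sketch below shows one way such counts could be derived, using Python's standard `unicodedata` module as a stand-in; the report does not state which library the profiler uses, and the `CATEGORY_NAMES` mapping and `category_counts` helper are illustrative names, not part of the report. The input is the two non-missing values listed in the Sample panel.

```python
import unicodedata
from collections import Counter

# The two non-missing values shown in the Sample panel for namePublishedInID.
values = [
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;"
    "GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES",
]

# Readable labels for a few general-category codes (illustrative subset only).
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(strings):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for s in strings:
        for ch in s:
            code = unicodedata.category(ch)  # two-letter code, e.g. "Lu"
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(values)
total = sum(counts.values())
for name, n in counts.most_common():
    print(f"{name} {n} {100 * n / total:.1f}%")
```

For these two strings the counts come to 140 uppercase letters, 16 connector punctuation characters, 2 other punctuation characters and 2 decimal digits out of 160 characters, which agrees with the "Most occurring categories" table above.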

taxonConceptID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 604625
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 10
Median length 10
Mean length 10
Min length 10

Characters and Unicode

Total characters 10
Distinct characters 9
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row StillImage
ValueCountFrequency (%)
stillimage 1
100.0%

Most occurring characters

ValueCountFrequency (%)
l 2
20.0%
S 1
10.0%
t 1
10.0%
i 1
10.0%
I 1
10.0%
m 1
10.0%
a 1
10.0%
g 1
10.0%
e 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8
80.0%
Uppercase Letter 2
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2
25.0%
t 1
12.5%
i 1
12.5%
m 1
12.5%
a 1
12.5%
g 1
12.5%
e 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
I 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2
20.0%
S 1
10.0%
t 1
10.0%
i 1
10.0%
I 1
10.0%
m 1
10.0%
a 1
10.0%
g 1
10.0%
e 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 2
20.0%
S 1
10.0%
t 1
10.0%
i 1
10.0%
I 1
10.0%
m 1
10.0%
a 1
10.0%
g 1
10.0%
e 1
10.0%

scientificName
Text
Distinct 203338
Distinct (%) 33.6%
Missing 2
Missing (%) < 0.1%
Memory size 4.6 MiB

Length

Max length 239
Median length 108
Mean length 31.31866747
Min length 4

Characters and Unicode

Total characters 18936018
Distinct characters 109
Distinct categories 8
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 154758
Unique (%) 25.6%

Sample

1st row Camponotus rufoglaucus var. rufigenis Forel
2nd row Athrips mesoleuca Lower, 1900
3rd row Paranthrene asilipennis (Boisduval, 1832)
4th row Acanthagrion trilobatum Leonard, 1977
5th row Calathus nanulus Casey, 1920
ValueCountFrequency (%)
bombus 62365
 
2.7%
29343
 
1.3%
hagen 24881
 
1.1%
cresson 24121
 
1.0%
1861 19352
 
0.8%
fabricius 16608
 
0.7%
1863 16510
 
0.7%
selys 15944
 
0.7%
casey 15917
 
0.7%
latreille 15270
 
0.7%
Other values (119252) 2103492
89.7%

Most occurring characters

ValueCountFrequency (%)
1739179
 
9.2%
a 1483659
 
7.8%
e 1211161
 
6.4%
i 1149549
 
6.1%
s 1059970
 
5.6%
r 959396
 
5.1%
o 891386
 
4.7%
l 793140
 
4.2%
n 766475
 
4.0%
1 670618
 
3.5%
Other values (99) 8211485
43.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12719422
67.2%
Decimal Number 2282464
 
12.1%
Space Separator 1739179
 
9.2%
Uppercase Letter 1240430
 
6.6%
Other Punctuation 615505
 
3.3%
Close Punctuation 167008
 
0.9%
Open Punctuation 167008
 
0.9%
Dash Punctuation 5002
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1483659
11.7%
e 1211161
 
9.5%
i 1149549
 
9.0%
s 1059970
 
8.3%
r 959396
 
7.5%
o 891386
 
7.0%
l 793140
 
6.2%
n 766475
 
6.0%
u 640369
 
5.0%
t 634331
 
5.0%
Other values (47) 3129986
24.6%
Uppercase Letter
ValueCountFrequency (%)
C 149154
12.0%
B 127924
 
10.3%
S 115986
 
9.4%
P 89895
 
7.2%
A 87131
 
7.0%
H 83287
 
6.7%
L 83192
 
6.7%
M 63745
 
5.1%
D 56317
 
4.5%
E 50255
 
4.1%
Other values (23) 333544
26.9%
Decimal Number
ValueCountFrequency (%)
1 670618
29.4%
8 384509
16.8%
9 327352
14.3%
7 165872
 
7.3%
0 131673
 
5.8%
6 131257
 
5.8%
3 128731
 
5.6%
2 126306
 
5.5%
5 111567
 
4.9%
4 104579
 
4.6%
Other Punctuation
ValueCountFrequency (%)
, 572531
93.0%
& 29332
 
4.8%
. 13436
 
2.2%
' 195
 
< 0.1%
? 11
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1739179
100.0%
Close Punctuation
ValueCountFrequency (%)
) 167008
100.0%
Open Punctuation
ValueCountFrequency (%)
( 167008
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5002
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13959852
73.7%
Common 4976166
 
26.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1483659
 
10.6%
e 1211161
 
8.7%
i 1149549
 
8.2%
s 1059970
 
7.6%
r 959396
 
6.9%
o 891386
 
6.4%
l 793140
 
5.7%
n 766475
 
5.5%
u 640369
 
4.6%
t 634331
 
4.5%
Other values (80) 4370416
31.3%
Common
ValueCountFrequency (%)
1739179
35.0%
1 670618
 
13.5%
, 572531
 
11.5%
8 384509
 
7.7%
9 327352
 
6.6%
) 167008
 
3.4%
( 167008
 
3.4%
7 165872
 
3.3%
0 131673
 
2.6%
6 131257
 
2.6%
Other values (9) 519159
 
10.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18911224
99.9%
None 24794
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1739179
 
9.2%
a 1483659
 
7.8%
e 1211161
 
6.4%
i 1149549
 
6.1%
s 1059970
 
5.6%
r 959396
 
5.1%
o 891386
 
4.7%
l 793140
 
4.2%
n 766475
 
4.1%
1 670618
 
3.5%
Other values (61) 8186691
43.3%
None
ValueCountFrequency (%)
é 9434
38.0%
ü 5136
20.7%
ö 3299
 
13.3%
å 1792
 
7.2%
á 1363
 
5.5%
ä 1221
 
4.9%
ç 858
 
3.5%
è 790
 
3.2%
ó 219
 
0.9%
í 140
 
0.6%
Other values (28) 542
 
2.2%

acceptedNameUsage
Text

Constant  Missing 

Distinct 1
Distinct (%) 50.0%
Missing 604624
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 10
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row false
2nd row false
ValueCountFrequency (%)
false 2
100.0%

Most occurring characters

ValueCountFrequency (%)
f 2
20.0%
a 2
20.0%
l 2
20.0%
s 2
20.0%
e 2
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 2
20.0%
a 2
20.0%
l 2
20.0%
s 2
20.0%
e 2
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 2
20.0%
a 2
20.0%
l 2
20.0%
s 2
20.0%
e 2
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 2
20.0%
a 2
20.0%
l 2
20.0%
s 2
20.0%
e 2
20.0%
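
The Distinct and Unique figures reported above for acceptedNameUsage (1 distinct value at 50.0%, 0 unique values, 604624 missing out of 604626 observations) are consistent with percentages taken over the non-missing values only, and with "unique" meaning values that occur exactly once. The sketch below works through that reading on a toy column. The definitions are inferred from the numbers in this report rather than from the profiler's source, and `distinct_unique_summary` is a hypothetical helper name.

```python
from collections import Counter


def distinct_unique_summary(values):
    """Summarise a column given as a list in which None marks a missing cell.

    Assumed definitions (inferred from the report's figures):
      distinct = number of different non-missing values
      unique   = number of non-missing values that occur exactly once
      both percentages are relative to the non-missing count
    """
    present = [v for v in values if v is not None]
    freq = Counter(present)
    n = len(present)
    distinct = len(freq)
    unique = sum(1 for c in freq.values() if c == 1)
    return {
        "distinct": distinct,
        "distinct_pct": 100 * distinct / n if n else 0.0,
        "unique": unique,
        "unique_pct": 100 * unique / n if n else 0.0,
        "missing": len(values) - n,
    }


# Toy column shaped like acceptedNameUsage: two "false" values, the rest missing.
column = ["false", None, "false", None, None]
print(distinct_unique_summary(column))
```

On the toy column this gives one distinct value (50.0% of the two non-missing cells) and no unique values, the same pattern as the panel above.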

parentNameUsage
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 604623
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 12
Median length 7
Mean length 8.666666667
Min length 7

Characters and Unicode

Total characters 26
Distinct characters 18
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row 1937011
2nd row Google Earth
3rd row 1424710
ValueCountFrequency (%)
1937011 1
25.0%
google 1
25.0%
earth 1
25.0%
1424710 1
25.0%

Most occurring characters

ValueCountFrequency (%)
1 5
19.2%
4 2
 
7.7%
7 2
 
7.7%
0 2
 
7.7%
o 2
 
7.7%
E 1
 
3.8%
h 1
 
3.8%
t 1
 
3.8%
r 1
 
3.8%
a 1
 
3.8%
Other values (8) 8
30.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
53.8%
Lowercase Letter 9
34.6%
Uppercase Letter 2
 
7.7%
Space Separator 1
 
3.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2
22.2%
h 1
11.1%
t 1
11.1%
r 1
11.1%
a 1
11.1%
e 1
11.1%
l 1
11.1%
g 1
11.1%
Decimal Number
ValueCountFrequency (%)
1 5
35.7%
4 2
 
14.3%
7 2
 
14.3%
0 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
E 1
50.0%
G 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 15
57.7%
Latin 11
42.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2
18.2%
E 1
9.1%
h 1
9.1%
t 1
9.1%
r 1
9.1%
a 1
9.1%
e 1
9.1%
l 1
9.1%
g 1
9.1%
G 1
9.1%
Common
ValueCountFrequency (%)
1 5
33.3%
4 2
 
13.3%
7 2
 
13.3%
0 2
 
13.3%
1
 
6.7%
9 1
 
6.7%
3 1
 
6.7%
2 1
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 5
19.2%
4 2
 
7.7%
7 2
 
7.7%
0 2
 
7.7%
o 2
 
7.7%
E 1
 
3.8%
h 1
 
3.8%
t 1
 
3.8%
r 1
 
3.8%
a 1
 
3.8%
Other values (8) 8
30.8%

originalNameUsage
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 604624
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 14
Distinct characters 7
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row 1937011
2nd row 1424710
ValueCountFrequency (%)
1937011 1
50.0%
1424710 1
50.0%

Most occurring characters

ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 14
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

nameAccordingTo
Text

Constant  Missing 

Distinct 1
Distinct (%) 50.0%
Missing 604624
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 2
Distinct characters 1
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 1
2nd row 1
ValueCountFrequency (%)
1 2
100.0%

Most occurring characters

ValueCountFrequency (%)
1 2
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
100.0%

namePublishedIn
Text

Constant  Missing 

Distinct 1
Distinct (%) 50.0%
Missing 604624
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 4
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 54
2nd row 54
ValueCountFrequency (%)
54 2
100.0%

Most occurring characters

ValueCountFrequency (%)
5 2
50.0%
4 2
50.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 2
50.0%
4 2
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 2
50.0%
4 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 2
50.0%
4 2
50.0%

namePublishedInYear
Text

Constant  Missing 

Distinct 1
Distinct (%) 50.0%
Missing 604624
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 6
Distinct characters 3
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 216
2nd row 216
ValueCountFrequency (%)
216 2
100.0%

Most occurring characters

ValueCountFrequency (%)
2 2
33.3%
1 2
33.3%
6 2
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2
33.3%
1 2
33.3%
6 2
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2
33.3%
1 2
33.3%
6 2
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2
33.3%
1 2
33.3%
6 2
33.3%

higherClassification
Text
Distinct 3456
Distinct (%) 0.6%
Missing 4647
Missing (%) 0.8%
Memory size 4.6 MiB

Length

Max length 97
Median length 91
Mean length 62.39093868
Min length 3

Characters and Unicode

Total characters 37433253
Distinct characters 63
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 577
Unique (%) 0.1%

Sample

1st row Animalia, Arthropoda, Insecta, Hymenoptera, Formicidae, Formicinae
2nd row Animalia, Arthropoda, Insecta, Lepidoptera, Gelechiidae, Gelechiinae
3rd row Animalia, Arthropoda, Insecta, Lepidoptera, Sesiidae, Sesiinae
4th row Animalia, Arthropoda, Insecta, Odonata, Zygoptera, Coenagrionidae
5th row Animalia, Arthropoda, Insecta, Coleoptera, Carabidae
ValueCountFrequency (%)
arthropoda 599697
17.3%
animalia 598328
17.3%
insecta 587915
17.0%
hymenoptera 146500
 
4.2%
odonata 117281
 
3.4%
lepidoptera 99941
 
2.9%
apidae 82932
 
2.4%
diptera 73535
 
2.1%
coleoptera 72078
 
2.1%
apinae 63521
 
1.8%
Other values (2938) 1026036
29.6%

Most occurring characters

ValueCountFrequency (%)
a 4570771
12.2%
e 2938293
 
7.8%
2867785
 
7.7%
, 2867419
 
7.7%
i 2865053
 
7.7%
o 2432895
 
6.5%
r 2316840
 
6.2%
t 2192044
 
5.9%
n 2160053
 
5.8%
p 1690137
 
4.5%
Other values (53) 10531963
28.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28230630
75.4%
Uppercase Letter 3467330
 
9.3%
Space Separator 2867785
 
7.7%
Other Punctuation 2867488
 
7.7%
Decimal Number 16
 
< 0.1%
Connector Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4570771
16.2%
e 2938293
10.4%
i 2865053
10.1%
o 2432895
8.6%
r 2316840
8.2%
t 2192044
7.8%
n 2160053
7.7%
p 1690137
 
6.0%
d 1537742
 
5.4%
l 1127926
 
4.0%
Other values (16) 4398876
15.6%
Uppercase Letter
ValueCountFrequency (%)
A 1474018
42.5%
I 598175
17.3%
C 245213
 
7.1%
H 231711
 
6.7%
L 182525
 
5.3%
O 125491
 
3.6%
P 113908
 
3.3%
D 95369
 
2.8%
S 80616
 
2.3%
Z 57602
 
1.7%
Other values (15) 262702
 
7.6%
Decimal Number
ValueCountFrequency (%)
6 3
18.8%
7 3
18.8%
9 3
18.8%
0 2
12.5%
1 2
12.5%
3 2
12.5%
8 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
, 2867419
> 99.9%
? 39
 
< 0.1%
/ 30
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2867785
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31697960
84.7%
Common 5735293
 
15.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4570771
14.4%
e 2938293
 
9.3%
i 2865053
 
9.0%
o 2432895
 
7.7%
r 2316840
 
7.3%
t 2192044
 
6.9%
n 2160053
 
6.8%
p 1690137
 
5.3%
d 1537742
 
4.9%
A 1474018
 
4.7%
Other values (41) 7520114
23.7%
Common
ValueCountFrequency (%)
2867785
50.0%
, 2867419
50.0%
? 39
 
< 0.1%
/ 30
 
< 0.1%
_ 4
 
< 0.1%
6 3
 
< 0.1%
7 3
 
< 0.1%
9 3
 
< 0.1%
0 2
 
< 0.1%
1 2
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37433253
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4570771
12.2%
e 2938293
 
7.8%
2867785
 
7.7%
, 2867419
 
7.7%
i 2865053
 
7.7%
o 2432895
 
6.5%
r 2316840
 
6.2%
t 2192044
 
5.9%
n 2160053
 
5.8%
p 1690137
 
4.5%
Other values (53) 10531963
28.1%

kingdom
Text
Distinct 4
Distinct (%) < 0.1%
Missing 2
Missing (%) < 0.1%
Memory size 4.6 MiB

Length

Max length 14
Median length 8
Mean length 8.046071608
Min length 4

Characters and Unicode

Total characters 4864848
Distinct characters 19
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row Animalia
2nd row Animalia
3rd row Animalia
4th row Animalia
5th row Animalia
ValueCountFrequency (%)
animalia 599978
98.5%
incertae 4644
 
0.8%
sedis 4644
 
0.8%
9417 1
 
< 0.1%
4209 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
i 1209244
24.9%
a 1204600
24.8%
n 604622
12.4%
A 599978
12.3%
m 599978
12.3%
l 599978
12.3%
e 13932
 
0.3%
s 9288
 
0.2%
4644
 
0.1%
d 4644
 
0.1%
Other values (9) 13940
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4260218
87.6%
Uppercase Letter 599978
 
12.3%
Space Separator 4644
 
0.1%
Decimal Number 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1209244
28.4%
a 1204600
28.3%
n 604622
14.2%
m 599978
14.1%
l 599978
14.1%
e 13932
 
0.3%
s 9288
 
0.2%
d 4644
 
0.1%
t 4644
 
0.1%
r 4644
 
0.1%
Decimal Number
ValueCountFrequency (%)
9 2
25.0%
4 2
25.0%
1 1
12.5%
7 1
12.5%
2 1
12.5%
0 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
A 599978
100.0%
Space Separator
ValueCountFrequency (%)
4644
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4860196
99.9%
Common 4652
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1209244
24.9%
a 1204600
24.8%
n 604622
12.4%
A 599978
12.3%
m 599978
12.3%
l 599978
12.3%
e 13932
 
0.3%
s 9288
 
0.2%
d 4644
 
0.1%
t 4644
 
0.1%
Other values (2) 9288
 
0.2%
Common
ValueCountFrequency (%)
4644
99.8%
9 2
 
< 0.1%
4 2
 
< 0.1%
1 1
 
< 0.1%
7 1
 
< 0.1%
2 1
 
< 0.1%
0 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4864848
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1209244
24.9%
a 1204600
24.8%
n 604622
12.4%
A 599978
12.3%
m 599978
12.3%
l 599978
12.3%
e 13932
 
0.3%
s 9288
 
0.2%
4644
 
0.1%
d 4644
 
0.1%
Other values (9) 13940
 
0.3%

phylum
Text

Distinct 9
Distinct (%) < 0.1%
Missing 5245
Missing (%) 0.9%
Memory size 4.6 MiB

Length

Max length 13
Median length 10
Mean length 9.999918249
Min length 7

Characters and Unicode

Total characters 5993761
Distinct characters 30
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) < 0.1%

Sample

1st row Arthropoda
2nd row Arthropoda
3rd row Arthropoda
4th row Arthropoda
5th row Arthropoda
ValueCountFrequency (%)
arthropoda 599346
> 99.9%
cnidaria 18
 
< 0.1%
onychophora 6
 
< 0.1%
mollusca 5
 
< 0.1%
chordata 2
 
< 0.1%
1936987 1
 
< 0.1%
nemertea 1
 
< 0.1%
1424684 1
 
< 0.1%
echinodermata 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
r 1198720
20.0%
o 1198712
20.0%
a 599400
10.0%
d 599367
10.0%
h 599361
10.0%
p 599352
10.0%
t 599350
10.0%
A 599346
10.0%
i 37
 
< 0.1%
n 25
 
< 0.1%
Other values (20) 91
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5394368
90.0%
Uppercase Letter 599379
 
10.0%
Decimal Number 14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1198720
22.2%
o 1198712
22.2%
a 599400
11.1%
d 599367
11.1%
h 599361
11.1%
p 599352
11.1%
t 599350
11.1%
i 37
 
< 0.1%
n 25
 
< 0.1%
c 12
 
< 0.1%
Other values (6) 32
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
4 3
21.4%
8 2
14.3%
6 2
14.3%
9 2
14.3%
1 2
14.3%
7 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
A 599346
> 99.9%
C 20
 
< 0.1%
O 6
 
< 0.1%
M 5
 
< 0.1%
N 1
 
< 0.1%
E 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 5993747
> 99.9%
Common 14
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1198720
20.0%
o 1198712
20.0%
a 599400
10.0%
d 599367
10.0%
h 599361
10.0%
p 599352
10.0%
t 599350
10.0%
A 599346
10.0%
i 37
 
< 0.1%
n 25
 
< 0.1%
Other values (12) 77
 
< 0.1%
Common
ValueCountFrequency (%)
4 3
21.4%
8 2
14.3%
6 2
14.3%
9 2
14.3%
1 2
14.3%
7 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5993761
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 1198720
20.0%
o 1198712
20.0%
a 599400
10.0%
d 599367
10.0%
h 599361
10.0%
p 599352
10.0%
t 599350
10.0%
A 599346
10.0%
i 37
 
< 0.1%
n 25
 
< 0.1%
Other values (20) 91
 
< 0.1%

class
Text

Distinct 13
Distinct (%) < 0.1%
Missing 5283
Missing (%) 0.9%
Memory size 4.6 MiB

Length

Max length 14
Median length 7
Mean length 7.038410393
Min length 7

Characters and Unicode

Total characters 4218422
Distinct characters 26
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row Insecta
2nd row Insecta
3rd row Insecta
4th row Insecta
5th row Insecta
ValueCountFrequency (%)
insecta 588111
98.1%
arachnida 7917
 
1.3%
diplopoda 1599
 
0.3%
collembola 820
 
0.1%
chilopoda 736
 
0.1%
diplura 77
 
< 0.1%
protura 62
 
< 0.1%
symphyla 8
 
< 0.1%
malacostraca 5
 
< 0.1%
pauropoda 4
 
< 0.1%
Other values (3) 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 607282
14.4%
c 596040
14.1%
n 596030
14.1%
e 588932
14.0%
t 588180
13.9%
s 588119
13.9%
I 588111
13.9%
i 10333
 
0.2%
d 10259
 
0.2%
h 8663
 
0.2%
Other values (16) 36473
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3619079
85.8%
Uppercase Letter 599343
 
14.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 607282
16.8%
c 596040
16.5%
n 596030
16.5%
e 588932
16.3%
t 588180
16.3%
s 588119
16.3%
i 10333
 
0.3%
d 10259
 
0.3%
h 8663
 
0.2%
r 8130
 
0.2%
Other values (7) 17111
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
I 588111
98.1%
A 7917
 
1.3%
D 1676
 
0.3%
C 1556
 
0.3%
P 66
 
< 0.1%
S 8
 
< 0.1%
M 5
 
< 0.1%
G 2
 
< 0.1%
E 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4218422
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 607282
14.4%
c 596040
14.1%
n 596030
14.1%
e 588932
14.0%
t 588180
13.9%
s 588119
13.9%
I 588111
13.9%
i 10333
 
0.2%
d 10259
 
0.2%
h 8663
 
0.2%
Other values (16) 36473
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4218422
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 607282
14.4%
c 596040
14.1%
n 596030
14.1%
e 588932
14.0%
t 588180
13.9%
s 588119
13.9%
I 588111
13.9%
i 10333
 
0.2%
d 10259
 
0.2%
h 8663
 
0.2%
Other values (16) 36473
 
0.9%

order
Text

Distinct 74
Distinct (%) < 0.1%
Missing 5577
Missing (%) 0.9%
Memory size 4.6 MiB

Length

Max length 17
Median length 16
Mean length 9.451483935
Min length 6

Characters and Unicode

Total characters 5661902
Distinct characters 47
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 8
Unique (%) < 0.1%

Sample

1st row Hymenoptera
2nd row Lepidoptera
3rd row Lepidoptera
4th row Odonata
5th row Coleoptera
ValueCountFrequency (%)
hymenoptera 146330
24.4%
odonata 117284
19.6%
lepidoptera 99491
16.6%
diptera 73566
12.3%
coleoptera 71961
12.0%
hemiptera 37757
 
6.3%
siphonaptera 10087
 
1.7%
trichoptera 9104
 
1.5%
thysanoptera 4628
 
0.8%
araneae 4624
 
0.8%
Other values (64) 24217
 
4.0%

Most occurring characters

ValueCountFrequency (%)
e 849733
15.0%
a 742019
13.1%
t 591304
10.4%
p 577797
10.2%
o 563778
10.0%
r 489265
8.6%
n 284180
 
5.0%
i 238281
 
4.2%
d 228781
 
4.0%
m 192363
 
3.4%
Other values (37) 904401
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5062841
89.4%
Uppercase Letter 599047
 
10.6%
Decimal Number 14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 849733
16.8%
a 742019
14.7%
t 591304
11.7%
p 577797
11.4%
o 563778
11.1%
r 489265
9.7%
n 284180
 
5.6%
i 238281
 
4.7%
d 228781
 
4.5%
m 192363
 
3.8%
Other values (12) 305340
 
6.0%
Uppercase Letter
ValueCountFrequency (%)
H 184087
30.7%
O 118498
19.8%
L 99813
16.7%
D 73747
12.3%
C 72099
 
12.0%
T 15063
 
2.5%
S 11901
 
2.0%
P 7210
 
1.2%
M 4867
 
0.8%
A 4742
 
0.8%
Other values (8) 7020
 
1.2%
Decimal Number
ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 5661888
> 99.9%
Common 14
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 849733
15.0%
a 742019
13.1%
t 591304
10.4%
p 577797
10.2%
o 563778
10.0%
r 489265
8.6%
n 284180
 
5.0%
i 238281
 
4.2%
d 228781
 
4.0%
m 192363
 
3.4%
Other values (30) 904387
16.0%
Common
ValueCountFrequency (%)
1 5
35.7%
7 2
 
14.3%
0 2
 
14.3%
4 2
 
14.3%
9 1
 
7.1%
3 1
 
7.1%
2 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5661902
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 849733
15.0%
a 742019
13.1%
t 591304
10.4%
p 577797
10.2%
o 563778
10.0%
r 489265
8.6%
n 284180
 
5.0%
i 238281
 
4.2%
d 228781
 
4.0%
m 192363
 
3.4%
Other values (37) 904401
16.0%

superfamily
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 604624
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 22
Median length 19.5
Mean length 19.5
Min length 17

Characters and Unicode

Total characters 39
Distinct characters 20
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Troides amphrysus
2nd row Gynacantha membranalis
ValueCountFrequency (%)
troides 1
25.0%
amphrysus 1
25.0%
gynacantha 1
25.0%
membranalis 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 6
15.4%
s 4
 
10.3%
n 3
 
7.7%
m 3
 
7.7%
r 3
 
7.7%
i 2
 
5.1%
e 2
 
5.1%
2
 
5.1%
h 2
 
5.1%
y 2
 
5.1%
Other values (10) 10
25.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 35
89.7%
Space Separator 2
 
5.1%
Uppercase Letter 2
 
5.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
17.1%
s 4
11.4%
n 3
8.6%
m 3
8.6%
r 3
8.6%
i 2
 
5.7%
e 2
 
5.7%
h 2
 
5.7%
y 2
 
5.7%
b 1
 
2.9%
Other values (7) 7
20.0%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
G 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37
94.9%
Common 2
 
5.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
16.2%
s 4
10.8%
n 3
 
8.1%
m 3
 
8.1%
r 3
 
8.1%
i 2
 
5.4%
e 2
 
5.4%
h 2
 
5.4%
y 2
 
5.4%
T 1
 
2.7%
Other values (9) 9
24.3%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6
15.4%
s 4
 
10.3%
n 3
 
7.7%
m 3
 
7.7%
r 3
 
7.7%
i 2
 
5.1%
e 2
 
5.1%
2
 
5.1%
h 2
 
5.1%
y 2
 
5.1%
Other values (10) 10
25.6%

family
Text

Missing 

Distinct 1494
Distinct (%) 0.3%
Missing 11642
Missing (%) 1.9%
Memory size 4.6 MiB

Length

Max length 35
Median length 21
Mean length 10.49803873
Min length 6

Characters and Unicode

Total characters 6225169
Distinct characters 60
Distinct categories 7
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 196
Unique (%) < 0.1%

Sample

1st row Formicidae
2nd row Gelechiidae
3rd row Sesiidae
4th row Coenagrionidae
5th row Carabidae
ValueCountFrequency (%)
apidae 82646
 
13.9%
libellulidae 42503
 
7.2%
coenagrionidae 36255
 
6.1%
chrysomelidae 17448
 
2.9%
crambidae 13614
 
2.3%
asilidae 13374
 
2.3%
geometridae 12793
 
2.2%
psychodidae 11788
 
2.0%
curculionidae 11689
 
2.0%
formicidae 9878
 
1.7%
Other values (1490) 341002
57.5%
2025-01-08T17:48:18.796563image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Most occurring characters

ValueCountFrequency (%)
i 902858
14.5%
e 876852
14.1%
a 817696
13.1%
d 657115
10.6%
o 322939
 
5.2%
l 317031
 
5.1%
r 285013
 
4.6%
p 208767
 
3.4%
n 202426
 
3.3%
h 150237
 
2.4%
Other values (50) 1484235
23.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5632165
90.5%
Uppercase Letter 592986
 
9.5%
Decimal Number 8
 
< 0.1%
Space Separator 6
 
< 0.1%
Other Punctuation 2
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 902858
16.0%
e 876852
15.6%
a 817696
14.5%
d 657115
11.7%
o 322939
 
5.7%
l 317031
 
5.6%
r 285013
 
5.1%
p 208767
 
3.7%
n 202426
 
3.6%
h 150237
 
2.7%
Other values (16) 891231
15.8%
Uppercase Letter
ValueCountFrequency (%)
C 138628
23.4%
A 122946
20.7%
L 65061
11.0%
P 58656
9.9%
T 31926
 
5.4%
S 31919
 
5.4%
G 26736
 
4.5%
E 18009
 
3.0%
M 16962
 
2.9%
N 16555
 
2.8%
Other values (16) 65588
11.1%
Decimal Number
ValueCountFrequency (%)
1 3
37.5%
9 2
25.0%
7 2
25.0%
8 1
 
12.5%
Space Separator
ValueCountFrequency (%)
6
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6225151
> 99.9%
Common 18
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 902858
14.5%
e 876852
14.1%
a 817696
13.1%
d 657115
10.6%
o 322939
 
5.2%
l 317031
 
5.1%
r 285013
 
4.6%
p 208767
 
3.4%
n 202426
 
3.3%
h 150237
 
2.4%
Other values (42) 1484217
23.8%
Common
ValueCountFrequency (%)
6
33.3%
1 3
16.7%
, 2
 
11.1%
9 2
 
11.1%
7 2
 
11.1%
8 1
 
5.6%
( 1
 
5.6%
) 1
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6225169
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 902858
14.5%
e 876852
14.1%
a 817696
13.1%
d 657115
10.6%
o 322939
 
5.2%
l 317031
 
5.1%
r 285013
 
4.6%
p 208767
 
3.4%
n 202426
 
3.3%
h 150237
 
2.4%
Other values (50) 1484235
23.8%

subfamily
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 604624
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 22
Median length: 19.5
Mean length: 19.5
Min length: 17

Characters and Unicode

Total characters: 39
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
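
As a rough illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the two sample values of this variable; the function name `category_counts` is hypothetical, and the report's own counting code may differ. Note that `unicodedata.category` returns two-letter codes (`Ll`, `Lu`, `Zs`) rather than the long names shown in the report.

```python
import unicodedata
from collections import Counter


def category_counts(strings):
    """Tally Unicode general categories over every character in the input."""
    counts = Counter()
    for text in strings:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(["Troides amphrysus", "Gynacantha membranalis"])
# Ll = lowercase letter, Lu = uppercase letter, Zs = space separator
print(dict(counts))
```

For these two values the tally comes to 35 lowercase letters, 2 uppercase letters and 2 space separators, 39 characters in total, which matches the figures reported for this variable.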

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: Troides amphrysus
2nd row: Gynacantha membranalis

Value          Count    Frequency (%)
troides        1        25.0%
amphrysus      1        25.0%
gynacantha     1        25.0%
membranalis    1        25.0%

Most occurring characters

ValueCountFrequency (%)
a 6
15.4%
s 4
 
10.3%
n 3
 
7.7%
m 3
 
7.7%
r 3
 
7.7%
i 2
 
5.1%
e 2
 
5.1%
2
 
5.1%
h 2
 
5.1%
y 2
 
5.1%
Other values (10) 10
25.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 35
89.7%
Space Separator 2
 
5.1%
Uppercase Letter 2
 
5.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
17.1%
s 4
11.4%
n 3
8.6%
m 3
8.6%
r 3
8.6%
i 2
 
5.7%
e 2
 
5.7%
h 2
 
5.7%
y 2
 
5.7%
b 1
 
2.9%
Other values (7) 7
20.0%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
G 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37
94.9%
Common 2
 
5.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
16.2%
s 4
10.8%
n 3
 
8.1%
m 3
 
8.1%
r 3
 
8.1%
i 2
 
5.4%
e 2
 
5.4%
h 2
 
5.4%
y 2
 
5.4%
T 1
 
2.7%
Other values (9) 9
24.3%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6
15.4%
s 4
 
10.3%
n 3
 
7.7%
m 3
 
7.7%
r 3
 
7.7%
i 2
 
5.1%
e 2
 
5.1%
2
 
5.1%
h 2
 
5.1%
y 2
 
5.1%
Other values (10) 10
25.6%

subtribe
Text

Constant  Missing 

Distinct: 1
Distinct (%): 50.0%
Missing: 604624
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: EML
2nd row: EML

Value    Count    Frequency (%)
eml      2        100.0%

Most occurring characters

ValueCountFrequency (%)
E 2
33.3%
M 2
33.3%
L 2
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 2
33.3%
M 2
33.3%
L 2
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 6
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 2
33.3%
M 2
33.3%
L 2
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 2
33.3%
M 2
33.3%
L 2
33.3%

genus
Text

Missing 

Distinct: 35722
Distinct (%): 6.1%
Missing: 19883
Missing (%): 3.3%
Memory size: 4.6 MiB

Length

Max length: 24
Median length: 19
Mean length: 8.97094279
Min length: 3

Characters and Unicode

Total characters: 5245696
Distinct characters: 64
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11794
Unique (%): 2.0%

Sample

1st row: Camponotus
2nd row: Athrips
3rd row: Paranthrene
4th row: Acanthagrion
5th row: Calathus

Value                   Count     Frequency (%)
bombus                  62386     10.7%
xylocopa                11739     2.0%
argia                   8660      1.5%
enallagma               7903      1.4%
crambus                 7885      1.3%
ischnura                7465      1.3%
sympetrum               6026      1.0%
apis                    4967      0.8%
erythrodiplax           4175      0.7%
lestes                  4149      0.7%
Other values (35712)    459388    78.6%

Most occurring characters

ValueCountFrequency (%)
a 535407
 
10.2%
o 472261
 
9.0%
s 396292
 
7.6%
i 368556
 
7.0%
e 354889
 
6.8%
r 324058
 
6.2%
l 257744
 
4.9%
u 248449
 
4.7%
t 231309
 
4.4%
m 228883
 
4.4%
Other values (54) 1827848
34.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4660907
88.9%
Uppercase Letter 584745
 
11.1%
Decimal Number 34
 
< 0.1%
Other Punctuation 6
 
< 0.1%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 535407
11.5%
o 472261
 
10.1%
s 396292
 
8.5%
i 368556
 
7.9%
e 354889
 
7.6%
r 324058
 
7.0%
l 257744
 
5.5%
u 248449
 
5.3%
t 231309
 
5.0%
m 228883
 
4.9%
Other values (16) 1243059
26.7%
Uppercase Letter
ValueCountFrequency (%)
B 76807
13.1%
P 69931
12.0%
A 64651
11.1%
C 63912
10.9%
E 41054
 
7.0%
S 37255
 
6.4%
L 29087
 
5.0%
H 28222
 
4.8%
M 27165
 
4.6%
T 26554
 
4.5%
Other values (16) 120107
20.5%
Decimal Number
ValueCountFrequency (%)
2 9
26.5%
1 5
14.7%
0 4
11.8%
4 4
11.8%
3 3
 
8.8%
5 3
 
8.8%
9 2
 
5.9%
8 2
 
5.9%
6 2
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 4
66.7%
. 2
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5245652
> 99.9%
Common 44
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 535407
 
10.2%
o 472261
 
9.0%
s 396292
 
7.6%
i 368556
 
7.0%
e 354889
 
6.8%
r 324058
 
6.2%
l 257744
 
4.9%
u 248449
 
4.7%
t 231309
 
4.4%
m 228883
 
4.4%
Other values (42) 1827804
34.8%
Common
ValueCountFrequency (%)
2 9
20.5%
1 5
11.4%
0 4
9.1%
4 4
9.1%
- 4
9.1%
: 4
9.1%
3 3
 
6.8%
5 3
 
6.8%
9 2
 
4.5%
8 2
 
4.5%
Other values (2) 4
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5245696
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 535407
 
10.2%
o 472261
 
9.0%
s 396292
 
7.6%
i 368556
 
7.0%
e 354889
 
6.8%
r 324058
 
6.2%
l 257744
 
4.9%
u 248449
 
4.7%
t 231309
 
4.4%
m 228883
 
4.4%
Other values (54) 1827848
34.8%

genericName
Text

Missing 

Distinct: 38103
Distinct (%): 6.5%
Missing: 19882
Missing (%): 3.3%
Memory size: 4.6 MiB

Length

Max length: 24
Median length: 19
Mean length: 8.918990191
Min length: 1

Characters and Unicode

Total characters: 5215326
Distinct characters: 65
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13468
Unique (%): 2.3%

Sample

1st row: Camponotus
2nd row: Athrips
3rd row: Paranthrene
4th row: Acanthagrion
5th row: Calathus

Value                   Count     Frequency (%)
bombus                  62365     10.7%
xylocopa                11743     2.0%
argia                   8660      1.5%
enallagma               7977      1.4%
crambus                 7970      1.4%
ischnura                7456      1.3%
sympetrum               6028      1.0%
apis                    4968      0.8%
lestes                  4235      0.7%
erythrodiplax           4175      0.7%
Other values (38093)    459167    78.5%

Most occurring characters

ValueCountFrequency (%)
a 528684
 
10.1%
o 470494
 
9.0%
s 396333
 
7.6%
i 366131
 
7.0%
e 352590
 
6.8%
r 320159
 
6.1%
l 255087
 
4.9%
u 247647
 
4.7%
m 230840
 
4.4%
t 230398
 
4.4%
Other values (55) 1816963
34.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4630536
88.8%
Uppercase Letter 584735
 
11.2%
Decimal Number 34
 
< 0.1%
Other Punctuation 17
 
< 0.1%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 528684
11.4%
o 470494
 
10.2%
s 396333
 
8.6%
i 366131
 
7.9%
e 352590
 
7.6%
r 320159
 
6.9%
l 255087
 
5.5%
u 247647
 
5.3%
m 230840
 
5.0%
t 230398
 
5.0%
Other values (18) 1232173
26.6%
Uppercase Letter
ValueCountFrequency (%)
B 76873
13.1%
P 68862
11.8%
A 65541
11.2%
C 63934
10.9%
E 40426
 
6.9%
S 36934
 
6.3%
L 31138
 
5.3%
T 27792
 
4.8%
H 27719
 
4.7%
M 26116
 
4.5%
Other values (16) 119400
20.4%
Decimal Number
ValueCountFrequency (%)
2 10
29.4%
1 8
23.5%
4 6
17.6%
0 4
 
11.8%
8 2
 
5.9%
3 2
 
5.9%
6 2
 
5.9%
Other Punctuation
ValueCountFrequency (%)
? 11
64.7%
: 4
 
23.5%
. 2
 
11.8%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5215271
> 99.9%
Common 55
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 528684
 
10.1%
o 470494
 
9.0%
s 396333
 
7.6%
i 366131
 
7.0%
e 352590
 
6.8%
r 320159
 
6.1%
l 255087
 
4.9%
u 247647
 
4.7%
m 230840
 
4.4%
t 230398
 
4.4%
Other values (44) 1816908
34.8%
Common
ValueCountFrequency (%)
? 11
20.0%
2 10
18.2%
1 8
14.5%
4 6
10.9%
0 4
 
7.3%
- 4
 
7.3%
: 4
 
7.3%
8 2
 
3.6%
3 2
 
3.6%
. 2
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5215319
> 99.9%
None 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 528684
 
10.1%
o 470494
 
9.0%
s 396333
 
7.6%
i 366131
 
7.0%
e 352590
 
6.8%
r 320159
 
6.1%
l 255087
 
4.9%
u 247647
 
4.7%
m 230840
 
4.4%
t 230398
 
4.4%
Other values (53) 1816956
34.8%
None
ValueCountFrequency (%)
ö 6
85.7%
ü 1
 
14.3%

subgenus
Text

Constant  Missing 

Distinct: 1
Distinct (%): 50.0%
Missing: 604624
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 8
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: true
2nd row: true

Value    Count    Frequency (%)
true     2        100.0%

Most occurring characters

ValueCountFrequency (%)
t 2
25.0%
r 2
25.0%
u 2
25.0%
e 2
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2
25.0%
r 2
25.0%
u 2
25.0%
e 2
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 2
25.0%
r 2
25.0%
u 2
25.0%
e 2
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 2
25.0%
r 2
25.0%
u 2
25.0%
e 2
25.0%

specificEpithet
Text

Missing 

Distinct: 74464
Distinct (%): 15.0%
Missing: 109508
Missing (%): 18.1%
Memory size: 4.6 MiB

Length

Max length: 22
Median length: 19
Mean length: 8.680070205
Min length: 2

Characters and Unicode

Total characters: 4297659
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 40224
Unique (%): 8.1%

Sample

1st row: rufoglaucus
2nd row: mesoleuca
3rd row: asilipennis
4th row: trilobatum
5th row: nanulus

Value                   Count     Frequency (%)
sylvicola               6282      1.3%
bifarius                4077      0.8%
kirbyellus              3621      0.7%
flavifrons              3474      0.7%
impatiens               3132      0.6%
nevadensis              2510      0.5%
cerana                  2431      0.5%
affinis                 2243      0.5%
mixtus                  2136      0.4%
bimaculatus             2025      0.4%
Other values (74454)    463187    93.6%

Most occurring characters

ValueCountFrequency (%)
a 563643
13.1%
i 497353
11.6%
s 385336
 
9.0%
e 331379
 
7.7%
l 295576
 
6.9%
n 285391
 
6.6%
r 268158
 
6.2%
u 259788
 
6.0%
t 231478
 
5.4%
c 208825
 
4.9%
Other values (22) 970732
22.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4297456
> 99.9%
Dash Punctuation 198
 
< 0.1%
Decimal Number 4
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 563643
13.1%
i 497353
11.6%
s 385336
 
9.0%
e 331379
 
7.7%
l 295576
 
6.9%
n 285391
 
6.6%
r 268158
 
6.2%
u 259788
 
6.0%
t 231478
 
5.4%
c 208825
 
4.9%
Other values (18) 970529
22.6%
Decimal Number
ValueCountFrequency (%)
1 2
50.0%
3 2
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 198
100.0%
Other Punctuation
ValueCountFrequency (%)
' 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4297456
> 99.9%
Common 203
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 563643
13.1%
i 497353
11.6%
s 385336
 
9.0%
e 331379
 
7.7%
l 295576
 
6.9%
n 285391
 
6.6%
r 268158
 
6.2%
u 259788
 
6.0%
t 231478
 
5.4%
c 208825
 
4.9%
Other values (18) 970529
22.6%
Common
ValueCountFrequency (%)
- 198
97.5%
1 2
 
1.0%
3 2
 
1.0%
' 1
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4297653
> 99.9%
None 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 563643
13.1%
i 497353
11.6%
s 385336
 
9.0%
e 331379
 
7.7%
l 295576
 
6.9%
n 285391
 
6.6%
r 268158
 
6.2%
u 259788
 
6.0%
t 231478
 
5.4%
c 208825
 
4.9%
Other values (20) 970726
22.6%
None
ValueCountFrequency (%)
ü 4
66.7%
ö 2
33.3%

infraspecificEpithet
Text

Missing 

Distinct: 4964
Distinct (%): 27.2%
Missing: 586367
Missing (%): 97.0%
Memory size: 4.6 MiB

Length

Max length: 23
Median length: 17
Mean length: 8.306752834
Min length: 3

Characters and Unicode

Total characters: 151673
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3559
Unique (%): 19.5%

Sample

1st row: rufigenis
2nd row: marianae
3rd row: neglectum
4th row: lavatus
5th row: floridensis

Value                  Count    Frequency (%)
violacea               979      5.4%
vagans                 869      4.8%
portia                 724      4.0%
auricomus              587      3.2%
virginica              587      3.2%
dorsata                437      2.4%
arizonensis            431      2.4%
bantorum               320      1.8%
binghami               303      1.7%
californica            291      1.6%
Other values (4954)    12731    69.7%

Most occurring characters

ValueCountFrequency (%)
a 22661
14.9%
i 18593
12.3%
s 12613
 
8.3%
n 10870
 
7.2%
r 10424
 
6.9%
e 9656
 
6.4%
o 9202
 
6.1%
c 7875
 
5.2%
u 7729
 
5.1%
l 7489
 
4.9%
Other values (17) 34561
22.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 151672
> 99.9%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 22661
14.9%
i 18593
12.3%
s 12613
 
8.3%
n 10870
 
7.2%
r 10424
 
6.9%
e 9656
 
6.4%
o 9202
 
6.1%
c 7875
 
5.2%
u 7729
 
5.1%
l 7489
 
4.9%
Other values (16) 34560
22.8%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 151672
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 22661
14.9%
i 18593
12.3%
s 12613
 
8.3%
n 10870
 
7.2%
r 10424
 
6.9%
e 9656
 
6.4%
o 9202
 
6.1%
c 7875
 
5.2%
u 7729
 
5.1%
l 7489
 
4.9%
Other values (16) 34560
22.8%
Common
ValueCountFrequency (%)
- 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 151673
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 22661
14.9%
i 18593
12.3%
s 12613
 
8.3%
n 10870
 
7.2%
r 10424
 
6.9%
e 9656
 
6.4%
o 9202
 
6.1%
c 7875
 
5.2%
u 7729
 
5.1%
l 7489
 
4.9%
Other values (17) 34561
22.8%

cultivarEpithet
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 604624
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 13
Median length: 8.5
Mean length: 8.5
Min length: 4
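
The length figures can be checked by hand from the two non-missing values of this variable, `ASIA` (4 characters) and `LATIN_AMERICA` (13 characters). The following is a small sketch using Python's standard `statistics` module; the helper name `length_stats` is hypothetical, and the profiling tool may compute its median differently for larger columns.

```python
import statistics


def length_stats(values):
    """Return max, median, mean and min string length for non-missing values."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }


# None stands in for the missing observations, which are ignored.
stats = length_stats(["ASIA", "LATIN_AMERICA", None])
print(stats)
```

With only two values, the median and the mean coincide at the midpoint of 4 and 13, which is the 8.5 shown above.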

Characters and Unicode

Total characters: 17
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: ASIA
2nd row: LATIN_AMERICA

Value            Count    Frequency (%)
asia             1        50.0%
latin_america    1        50.0%

Most occurring characters

ValueCountFrequency (%)
A 5
29.4%
I 3
17.6%
S 1
 
5.9%
L 1
 
5.9%
T 1
 
5.9%
N 1
 
5.9%
_ 1
 
5.9%
M 1
 
5.9%
E 1
 
5.9%
R 1
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 16
94.1%
Connector Punctuation 1
 
5.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 5
31.2%
I 3
18.8%
S 1
 
6.2%
L 1
 
6.2%
T 1
 
6.2%
N 1
 
6.2%
M 1
 
6.2%
E 1
 
6.2%
R 1
 
6.2%
C 1
 
6.2%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
94.1%
Common 1
 
5.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 5
31.2%
I 3
18.8%
S 1
 
6.2%
L 1
 
6.2%
T 1
 
6.2%
N 1
 
6.2%
M 1
 
6.2%
E 1
 
6.2%
R 1
 
6.2%
C 1
 
6.2%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 5
29.4%
I 3
17.6%
S 1
 
5.9%
L 1
 
5.9%
T 1
 
5.9%
N 1
 
5.9%
_ 1
 
5.9%
M 1
 
5.9%
E 1
 
5.9%
R 1
 
5.9%
Distinct: 12
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 4.6 MiB

Length

Max length: 13
Median length: 7
Mean length: 6.758805472
Min length: 4

Characters and Unicode

Total characters: 4086536
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: VARIETY
2nd row: SPECIES
3rd row: SPECIES
4th row: SPECIES
5th row: SPECIES

Value               Count     Frequency (%)
species             476863    78.9%
genus               89611     14.8%
subspecies          17825     2.9%
family              10445     1.7%
kingdom             4662      0.8%
order               4514      0.7%
variety             391       0.1%
class               253       < 0.1%
form                41        < 0.1%
unranked            11        < 0.1%
Other values (2)    8         < 0.1%

Most occurring characters

ValueCountFrequency (%)
S 1097318
26.9%
E 1083905
26.5%
I 510188
12.5%
C 494943
12.1%
P 494694
12.1%
U 107453
 
2.6%
N 94297
 
2.3%
G 94273
 
2.3%
B 17825
 
0.4%
M 15156
 
0.4%
Other values (12) 76484
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4086534
> 99.9%
Connector Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 1097318
26.9%
E 1083905
26.5%
I 510188
12.5%
C 494943
12.1%
P 494694
12.1%
U 107453
 
2.6%
N 94297
 
2.3%
G 94273
 
2.3%
B 17825
 
0.4%
M 15156
 
0.4%
Other values (11) 76482
 
1.9%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4086534
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1097318
26.9%
E 1083905
26.5%
I 510188
12.5%
C 494943
12.1%
P 494694
12.1%
U 107453
 
2.6%
N 94297
 
2.3%
G 94273
 
2.3%
B 17825
 
0.4%
M 15156
 
0.4%
Other values (11) 76482
 
1.9%
Common
ValueCountFrequency (%)
_ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4086536
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1097318
26.9%
E 1083905
26.5%
I 510188
12.5%
C 494943
12.1%
P 494694
12.1%
U 107453
 
2.6%
N 94297
 
2.3%
G 94273
 
2.3%
B 17825
 
0.4%
M 15156
 
0.4%
Other values (12) 76484
 
1.9%

verbatimTaxonRank
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604625
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 3
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: PER

Value    Count    Frequency (%)
per      1        100.0%

Most occurring characters

ValueCountFrequency (%)
P 1
33.3%
E 1
33.3%
R 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
E 1
33.3%
R 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 3
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 1
33.3%
E 1
33.3%
R 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 1
33.3%
E 1
33.3%
R 1
33.3%

vernacularName
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 604624
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 8
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: TYPE
2nd row: Peru

Value    Count    Frequency (%)
type     1        50.0%
peru     1        50.0%

Most occurring characters

ValueCountFrequency (%)
P 2
25.0%
T 1
12.5%
Y 1
12.5%
E 1
12.5%
e 1
12.5%
r 1
12.5%
u 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5
62.5%
Lowercase Letter 3
37.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 2
40.0%
T 1
20.0%
Y 1
20.0%
E 1
20.0%
Lowercase Letter
ValueCountFrequency (%)
e 1
33.3%
r 1
33.3%
u 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 2
25.0%
T 1
12.5%
Y 1
12.5%
E 1
12.5%
e 1
12.5%
r 1
12.5%
u 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 2
25.0%
T 1
12.5%
Y 1
12.5%
E 1
12.5%
e 1
12.5%
r 1
12.5%
u 1
12.5%

nomenclaturalCode
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604625
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 7
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: PER.16_1

Value       Count    Frequency (%)
per.16_1    1        100.0%

Most occurring characters

ValueCountFrequency (%)
1 2
25.0%
P 1
12.5%
E 1
12.5%
R 1
12.5%
. 1
12.5%
6 1
12.5%
_ 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
37.5%
Uppercase Letter 3
37.5%
Other Punctuation 1
 
12.5%
Connector Punctuation 1
 
12.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
E 1
33.3%
R 1
33.3%
Decimal Number
ValueCountFrequency (%)
1 2
66.7%
6 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5
62.5%
Latin 3
37.5%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
40.0%
. 1
20.0%
6 1
20.0%
_ 1
20.0%
Latin
ValueCountFrequency (%)
P 1
33.3%
E 1
33.3%
R 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
25.0%
P 1
12.5%
E 1
12.5%
R 1
12.5%
. 1
12.5%
6 1
12.5%
_ 1
12.5%
Distinct: 4
Distinct (%): < 0.1%
Missing: 4647
Missing (%): 0.8%
Memory size: 4.6 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.880410814
Min length: 4

Characters and Unicode

Total characters: 4728081
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ACCEPTED
2nd row: ACCEPTED
3rd row: ACCEPTED
4th row: ACCEPTED
5th row: SYNONYM

Value       Count     Frequency (%)
accepted    518943    86.5%
synonym     71747     12.0%
doubtful    9288      1.5%
lima        1         < 0.1%

Most occurring characters

ValueCountFrequency (%)
E 1037886
22.0%
C 1037886
22.0%
T 528231
11.2%
D 528231
11.2%
A 518943
11.0%
P 518943
11.0%
N 143494
 
3.0%
Y 143494
 
3.0%
O 81035
 
1.7%
S 71747
 
1.5%
Other values (8) 118191
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4728078
> 99.9%
Lowercase Letter 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1037886
22.0%
C 1037886
22.0%
T 528231
11.2%
D 528231
11.2%
A 518943
11.0%
P 518943
11.0%
N 143494
 
3.0%
Y 143494
 
3.0%
O 81035
 
1.7%
S 71747
 
1.5%
Other values (5) 118188
 
2.5%
Lowercase Letter
ValueCountFrequency (%)
i 1
33.3%
m 1
33.3%
a 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 4728081
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1037886
22.0%
C 1037886
22.0%
T 528231
11.2%
D 528231
11.2%
A 518943
11.0%
P 518943
11.0%
N 143494
 
3.0%
Y 143494
 
3.0%
O 81035
 
1.7%
S 71747
 
1.5%
Other values (8) 118191
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4728081
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1037886
22.0%
C 1037886
22.0%
T 528231
11.2%
D 528231
11.2%
A 518943
11.0%
P 518943
11.0%
N 143494
 
3.0%
Y 143494
 
3.0%
O 81035
 
1.7%
S 71747
 
1.5%
Other values (8) 118191
 
2.5%

nomenclaturalStatus
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604625
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 7
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: PER.16.6_1

Value         Count    Frequency (%)
per.16.6_1    1        100.0%

Most occurring characters

ValueCountFrequency (%)
. 2
20.0%
1 2
20.0%
6 2
20.0%
P 1
10.0%
E 1
10.0%
R 1
10.0%
_ 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
40.0%
Uppercase Letter 3
30.0%
Other Punctuation 2
20.0%
Connector Punctuation 1
 
10.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
E 1
33.3%
R 1
33.3%
Decimal Number
ValueCountFrequency (%)
1 2
50.0%
6 2
50.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7
70.0%
Latin 3
30.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 2
28.6%
1 2
28.6%
6 2
28.6%
_ 1
14.3%
Latin
ValueCountFrequency (%)
P 1
33.3%
E 1
33.3%
R 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 2
20.0%
1 2
20.0%
6 2
20.0%
P 1
10.0%
E 1
10.0%
R 1
10.0%
_ 1
10.0%

taxonRemarks
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604625
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Huarochiri

Value         Count    Frequency (%)
huarochiri    1        100.0%

Most occurring characters

ValueCountFrequency (%)
r 2
20.0%
i 2
20.0%
H 1
10.0%
u 1
10.0%
a 1
10.0%
o 1
10.0%
c 1
10.0%
h 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
90.0%
Uppercase Letter 1
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 2
22.2%
i 2
22.2%
u 1
11.1%
a 1
11.1%
o 1
11.1%
c 1
11.1%
h 1
11.1%
Uppercase Letter
ValueCountFrequency (%)
H 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 2
20.0%
i 2
20.0%
H 1
10.0%
u 1
10.0%
a 1
10.0%
o 1
10.0%
c 1
10.0%
h 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 2
20.0%
i 2
20.0%
H 1
10.0%
u 1
10.0%
a 1
10.0%
o 1
10.0%
c 1
10.0%
h 1
10.0%
Distinct2
Distinct (%)< 0.1%
Missing3
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length36
Median length36
Mean length35.99996196
Min length13

Characters and Unicode

Total characters21766405
Distinct characters21
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row821cc27a-e3bb-4bc5-ac34-89ada245069d
3rd row821cc27a-e3bb-4bc5-ac34-89ada245069d
4th row821cc27a-e3bb-4bc5-ac34-89ada245069d
5th row821cc27a-e3bb-4bc5-ac34-89ada245069d
ValueCountFrequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d 604622
> 99.9%
per.16.6.16_1 1
 
< 0.1%
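
Almost every value here is the same UUID, and the one exception (`PER.16.6.16_1`, per the character counts) does not look like a key at all, which suggests a row whose fields shifted during parsing. A minimal sketch of how such rows could be flagged, assuming the column is meant to hold canonical 8-4-4-4-12 hexadecimal UUIDs (the helper name `non_uuid_values` is hypothetical):

```python
import re

# Canonical textual UUID: 8-4-4-4-12 hexadecimal digits.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)

def non_uuid_values(values):
    """Return the non-missing values that are not well-formed UUIDs."""
    return [v for v in values if v is not None and not UUID_RE.fullmatch(v)]

sample = [
    "821cc27a-e3bb-4bc5-ac34-89ada245069d",
    "PER.16.6.16_1",
    None,  # missing cell
]
print(non_uuid_values(sample))
```

The same pattern-based check could be applied to other variables whose expected format is known, such as the `true`/`false` flags further down.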

Most occurring characters

ValueCountFrequency (%)
c 2418488
11.1%
a 2418488
11.1%
- 2418488
11.1%
4 1813866
8.3%
b 1813866
8.3%
2 1813866
8.3%
d 1209244
 
5.6%
9 1209244
 
5.6%
5 1209244
 
5.6%
8 1209244
 
5.6%
Other values (11) 4232367
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10883202
50.0%
Lowercase Letter 8464708
38.9%
Dash Punctuation 2418488
 
11.1%
Other Punctuation 3
 
< 0.1%
Uppercase Letter 3
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 1813866
16.7%
2 1813866
16.7%
9 1209244
11.1%
5 1209244
11.1%
8 1209244
11.1%
3 1209244
11.1%
1 604625
 
5.6%
6 604625
 
5.6%
7 604622
 
5.6%
0 604622
 
5.6%
Lowercase Letter
ValueCountFrequency (%)
c 2418488
28.6%
a 2418488
28.6%
b 1813866
21.4%
d 1209244
14.3%
e 604622
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
E 1
33.3%
R 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 2418488
100.0%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13301694
61.1%
Latin 8464711
38.9%

Most frequent character per script

Common
ValueCountFrequency (%)
- 2418488
18.2%
4 1813866
13.6%
2 1813866
13.6%
9 1209244
9.1%
5 1209244
9.1%
8 1209244
9.1%
3 1209244
9.1%
1 604625
 
4.5%
6 604625
 
4.5%
7 604622
 
4.5%
Other values (3) 604626
 
4.5%
Latin
ValueCountFrequency (%)
c 2418488
28.6%
a 2418488
28.6%
b 1813866
21.4%
d 1209244
14.3%
e 604622
 
7.1%
P 1
 
< 0.1%
E 1
 
< 0.1%
R 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21766405
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 2418488
11.1%
a 2418488
11.1%
- 2418488
11.1%
4 1813866
8.3%
b 1813866
8.3%
2 1813866
8.3%
d 1209244
 
5.6%
9 1209244
 
5.6%
5 1209244
 
5.6%
8 1209244
 
5.6%
Other values (11) 4232367
19.4%
Distinct2
Distinct (%)< 0.1%
Missing3
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length11
Median length2
Mean length2.000014885
Min length2

Characters and Unicode

Total characters1209255
Distinct characters9
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
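
The length statistics can be cross-checked against the character totals: mean length is the total character count divided by the number of non-missing values. A small sketch using the figures reported for this variable, and assuming the 604626 observations stated in the report overview:

```python
# Figures copied from the report for this variable.
n_observations = 604626      # from the dataset overview
n_missing = 3
total_characters = 1209255

n_present = n_observations - n_missing
mean_length = total_characters / n_present
print(n_present, round(mean_length, 9))
```

The result agrees with the reported mean length of 2.000014885. The small excess over 2 comes from the single 11-character value among the two-letter codes.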

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowUS
2nd rowUS
3rd rowUS
4th rowUS
5th rowUS
ValueCountFrequency (%)
us 604622
> 99.9%
san 1
 
< 0.1%
antonio 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
S 604623
50.0%
U 604622
50.0%
n 3
 
< 0.1%
o 2
 
< 0.1%
a 1
 
< 0.1%
(space) 1
 
< 0.1%
A 1
 
< 0.1%
t 1
 
< 0.1%
i 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1209246
> 99.9%
Lowercase Letter 8
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 3
37.5%
o 2
25.0%
a 1
 
12.5%
t 1
 
12.5%
i 1
 
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 604623
50.0%
U 604622
50.0%
A 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1209254
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 604623
50.0%
U 604622
50.0%
n 3
 
< 0.1%
o 2
 
< 0.1%
a 1
 
< 0.1%
A 1
 
< 0.1%
t 1
 
< 0.1%
i 1
 
< 0.1%
Common
ValueCountFrequency (%)
(space) 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1209255
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 604623
50.0%
U 604622
50.0%
n 3
 
< 0.1%
o 2
 
< 0.1%
a 1
 
< 0.1%
(space) 1
 
< 0.1%
A 1
 
< 0.1%
t 1
 
< 0.1%
i 1
 
< 0.1%
Distinct186893
Distinct (%)30.9%
Missing2
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length24
Median length24
Mean length23.9957792
Min length2

Characters and Unicode

Total characters14508424
Distinct characters17
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique38990 ?
Unique (%)6.4%

Sample

1st row2024-12-02T13:57:44.315Z
2nd row2024-12-02T13:57:18.321Z
3rd row2024-12-02T13:59:05.381Z
4th row2024-12-02T13:57:22.450Z
5th row2024-12-02T13:57:21.275Z
ValueCountFrequency (%)
2024-12-02t13:57:45.539z 16
 
< 0.1%
2024-12-02t13:57:59.931z 16
 
< 0.1%
2024-12-02t13:57:53.908z 16
 
< 0.1%
2024-12-02t13:57:26.378z 16
 
< 0.1%
2024-12-02t13:57:29.420z 15
 
< 0.1%
2024-12-02t13:56:43.735z 15
 
< 0.1%
2024-12-02t13:57:51.108z 15
 
< 0.1%
2024-12-02t13:58:53.448z 15
 
< 0.1%
2024-12-02t13:56:41.760z 15
 
< 0.1%
2024-12-02t13:57:19.226z 15
 
< 0.1%
Other values (186883) 604470
> 99.9%
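
The values in this variable look like ISO 8601 UTC timestamps, but the character counts show fewer `.` characters (603995) than `T` or `Z` characters (604622), so some entries presumably omit the fractional seconds. A minimum length of 2 also points to a few entries that are not timestamps at all. A hedged sketch of a tolerant parser follows; the helper name `parse_utc` and the two format strings are assumptions inferred from the sample rows.

```python
from datetime import datetime, timezone
from typing import Optional

# Formats assumed from the sample rows: with and without fractional seconds.
_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")

def parse_utc(value: str) -> Optional[datetime]:
    """Parse a 'Z'-suffixed timestamp; return None if it does not match."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    return None

print(parse_utc("2024-12-02T13:57:44.315Z"))
print(parse_utc("2024-12-02T13:57:44Z"))
print(parse_utc("not-a-timestamp"))  # made-up stray value, yields None
```

Returning `None` rather than raising keeps stray values visible for inspection without aborting a full pass over the column.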

Most occurring characters

ValueCountFrequency (%)
2 2760432
19.0%
0 1532584
10.6%
1 1525143
10.5%
: 1209244
8.3%
- 1209244
8.3%
4 972748
 
6.7%
5 960823
 
6.6%
3 957684
 
6.6%
T 604622
 
4.2%
Z 604622
 
4.2%
Other values (7) 2171278
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10276693
70.8%
Other Punctuation 1813239
 
12.5%
Uppercase Letter 1209248
 
8.3%
Dash Punctuation 1209244
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2760432
26.9%
0 1532584
14.9%
1 1525143
14.8%
4 972748
 
9.5%
5 960823
 
9.3%
3 957684
 
9.3%
7 464169
 
4.5%
9 387034
 
3.8%
6 364187
 
3.5%
8 351889
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
T 604622
50.0%
Z 604622
50.0%
L 2
 
< 0.1%
C 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 1209244
66.7%
. 603995
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 1209244
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13299176
91.7%
Latin 1209248
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2760432
20.8%
0 1532584
11.5%
1 1525143
11.5%
: 1209244
9.1%
- 1209244
9.1%
4 972748
 
7.3%
5 960823
 
7.2%
3 957684
 
7.2%
. 603995
 
4.5%
7 464169
 
3.5%
Other values (3) 1103110
 
8.3%
Latin
ValueCountFrequency (%)
T 604622
50.0%
Z 604622
50.0%
L 2
 
< 0.1%
C 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14508424
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2760432
19.0%
0 1532584
10.6%
1 1525143
10.5%
: 1209244
8.3%
- 1209244
8.3%
4 972748
 
6.7%
5 960823
 
6.6%
3 957684
 
6.6%
T 604622
 
4.2%
Z 604622
 
4.2%
Other values (7) 2171278
15.0%

elevation
Text

Missing 

Distinct1990
Distinct (%)4.3%
Missing557870
Missing (%)92.3%
Memory size4.6 MiB

Length

Max length7
Median length6
Mean length5.390794764
Min length3

Characters and Unicode

Total characters252052
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique527 ?
Unique (%)1.1%

Sample

1st row2040.0
2nd row240.0
3rd row165.0
4th row400.0
5th row1300.0
ValueCountFrequency (%)
2743.0 1163
 
2.5%
3353.0 875
 
1.9%
1524.0 704
 
1.5%
1829.0 659
 
1.4%
1100.0 556
 
1.2%
914.0 524
 
1.1%
427.0 524
 
1.1%
250.0 506
 
1.1%
200.0 496
 
1.1%
1372.0 495
 
1.1%
Other values (1976) 40254
86.1%
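
Although the report types this variable as Text, the sample rows are decimal numbers. A minimal sketch of coercing such strings to floats while keeping missing or malformed cells as `None` (the helper `to_float` is hypothetical, and `-5.0` and `n/a` are illustrative inputs, not values taken from the report):

```python
from typing import Optional

def to_float(value: Optional[str]) -> Optional[float]:
    """Convert a text cell to float; return None for missing or bad input."""
    if value is None or not value.strip():
        return None
    try:
        return float(value)
    except ValueError:
        return None

samples = ["2040.0", "240.0", "-5.0", "", None, "n/a"]
print([to_float(s) for s in samples])
```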

Most occurring characters

ValueCountFrequency (%)
0 73766
29.3%
. 46756
18.6%
1 25753
 
10.2%
2 21657
 
8.6%
5 16361
 
6.5%
3 15333
 
6.1%
4 13857
 
5.5%
7 11572
 
4.6%
6 9583
 
3.8%
9 9338
 
3.7%
Other values (2) 8076
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 205275
81.4%
Other Punctuation 46756
 
18.6%
Dash Punctuation 21
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 73766
35.9%
1 25753
 
12.5%
2 21657
 
10.6%
5 16361
 
8.0%
3 15333
 
7.5%
4 13857
 
6.8%
7 11572
 
5.6%
6 9583
 
4.7%
9 9338
 
4.5%
8 8055
 
3.9%
Other Punctuation
ValueCountFrequency (%)
. 46756
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 252052
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 73766
29.3%
. 46756
18.6%
1 25753
 
10.2%
2 21657
 
8.6%
5 16361
 
6.5%
3 15333
 
6.1%
4 13857
 
5.5%
7 11572
 
4.6%
6 9583
 
3.8%
9 9338
 
3.7%
Other values (2) 8076
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 252052
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 73766
29.3%
. 46756
18.6%
1 25753
 
10.2%
2 21657
 
8.6%
5 16361
 
6.5%
3 15333
 
6.1%
4 13857
 
5.5%
7 11572
 
4.6%
6 9583
 
3.8%
9 9338
 
3.7%
Other values (2) 8076
 
3.2%

elevationAccuracy
Text

Missing 

Distinct215
Distinct (%)0.7%
Missing573282
Missing (%)94.8%
Memory size4.6 MiB

Length

Max length7
Median length3
Mean length3.207886677
Min length3

Characters and Unicode

Total characters100548
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique90 ?
Unique (%)0.3%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0
ValueCountFrequency (%)
0.0 27237
86.9%
152.5 408
 
1.3%
30.5 325
 
1.0%
457.0 257
 
0.8%
100.0 249
 
0.8%
15.0 217
 
0.7%
914.0 185
 
0.6%
50.0 181
 
0.6%
305.0 175
 
0.6%
25.0 147
 
0.5%
Other values (205) 1963
 
6.3%

Most occurring characters

ValueCountFrequency (%)
0 58739
58.4%
. 31342
31.2%
5 3647
 
3.6%
1 1753
 
1.7%
2 1231
 
1.2%
3 993
 
1.0%
7 914
 
0.9%
4 802
 
0.8%
6 537
 
0.5%
9 371
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 69206
68.8%
Other Punctuation 31342
31.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 58739
84.9%
5 3647
 
5.3%
1 1753
 
2.5%
2 1231
 
1.8%
3 993
 
1.4%
7 914
 
1.3%
4 802
 
1.2%
6 537
 
0.8%
9 371
 
0.5%
8 219
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 31342
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 100548
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 58739
58.4%
. 31342
31.2%
5 3647
 
3.6%
1 1753
 
1.7%
2 1231
 
1.2%
3 993
 
1.0%
7 914
 
0.9%
4 802
 
0.8%
6 537
 
0.5%
9 371
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 100548
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 58739
58.4%
. 31342
31.2%
5 3647
 
3.6%
1 1753
 
1.7%
2 1231
 
1.2%
3 993
 
1.0%
7 914
 
0.9%
4 802
 
0.8%
6 537
 
0.5%
9 371
 
0.4%

depth
Text

Missing 

Distinct12
Distinct (%)35.3%
Missing604592
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length6
Median length5
Mean length5.147058824
Min length5

Characters and Unicode

Total characters175
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6 ?
Unique (%)17.6%

Sample

1st row110.0
2nd row250.0
3rd row110.0
4th row370.0
5th row359.0
ValueCountFrequency (%)
250.0 9
26.5%
110.0 6
17.6%
880.0 6
17.6%
370.0 3
 
8.8%
1707.0 2
 
5.9%
775.0 2
 
5.9%
359.0 1
 
2.9%
1400.0 1
 
2.9%
1743.0 1
 
2.9%
500.0 1
 
2.9%
Other values (2) 2
 
5.9%

Most occurring characters

ValueCountFrequency (%)
0 68
38.9%
. 34
19.4%
1 16
 
9.1%
5 13
 
7.4%
7 13
 
7.4%
8 12
 
6.9%
2 9
 
5.1%
3 6
 
3.4%
4 2
 
1.1%
9 1
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141
80.6%
Other Punctuation 34
 
19.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 68
48.2%
1 16
 
11.3%
5 13
 
9.2%
7 13
 
9.2%
8 12
 
8.5%
2 9
 
6.4%
3 6
 
4.3%
4 2
 
1.4%
9 1
 
0.7%
6 1
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 34
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 175
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 68
38.9%
. 34
19.4%
1 16
 
9.1%
5 13
 
7.4%
7 13
 
7.4%
8 12
 
6.9%
2 9
 
5.1%
3 6
 
3.4%
4 2
 
1.1%
9 1
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 175
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 68
38.9%
. 34
19.4%
1 16
 
9.1%
5 13
 
7.4%
7 13
 
7.4%
8 12
 
6.9%
2 9
 
5.1%
3 6
 
3.4%
4 2
 
1.1%
9 1
 
0.6%

depthAccuracy
Text

Missing 

Distinct2
Distinct (%)18.2%
Missing604615
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length5
Median length5
Mean length4.090909091
Min length3

Characters and Unicode

Total characters45
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row110.0
2nd row110.0
3rd row0.0
4th row110.0
5th row0.0
ValueCountFrequency (%)
110.0 6
54.5%
0.0 5
45.5%

Most occurring characters

ValueCountFrequency (%)
0 22
48.9%
1 12
26.7%
. 11
24.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 34
75.6%
Other Punctuation 11
 
24.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 22
64.7%
1 12
35.3%
Other Punctuation
ValueCountFrequency (%)
. 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 45
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 22
48.9%
1 12
26.7%
. 11
24.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 22
48.9%
1 12
26.7%
. 11
24.4%
Distinct259
Distinct (%)8.6%
Missing601631
Missing (%)99.5%
Memory size4.6 MiB

Length

Max length18
Median length17
Mean length14.27512521
Min length3

Characters and Unicode

Total characters42754
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique107 ?
Unique (%)3.6%

Sample

1st row4105.643932903784
2nd row4067.9280715056975
3rd row3039.0244431707993
4th row0.0
5th row2839.7634303533896
ValueCountFrequency (%)
0.0 634
21.2%
4105.643932903784 593
19.8%
949.7490617483568 164
 
5.5%
513.8699121355281 112
 
3.7%
4282.192003849806 80
 
2.7%
347.46362945305606 75
 
2.5%
1404.2075323592617 56
 
1.9%
512.1584099513866 45
 
1.5%
247.47802974000376 40
 
1.3%
3590.2355648532216 39
 
1.3%
Other values (249) 1157
38.6%

Most occurring characters

ValueCountFrequency (%)
0 5141
12.0%
4 5070
11.9%
3 4873
11.4%
9 4010
9.4%
1 3629
8.5%
2 3557
8.3%
6 3536
8.3%
5 3528
8.3%
8 3361
7.9%
7 3054
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 39759
93.0%
Other Punctuation 2995
 
7.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5141
12.9%
4 5070
12.8%
3 4873
12.3%
9 4010
10.1%
1 3629
9.1%
2 3557
8.9%
6 3536
8.9%
5 3528
8.9%
8 3361
8.5%
7 3054
7.7%
Other Punctuation
ValueCountFrequency (%)
. 2995
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42754
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 5141
12.0%
4 5070
11.9%
3 4873
11.4%
9 4010
9.4%
1 3629
8.5%
2 3557
8.3%
6 3536
8.3%
5 3528
8.3%
8 3361
7.9%
7 3054
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42754
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 5141
12.0%
4 5070
11.9%
3 4873
11.4%
9 4010
9.4%
1 3629
8.5%
2 3557
8.3%
6 3536
8.3%
5 3528
8.3%
8 3361
7.9%
7 3054
7.1%

issue
Text

Distinct143
Distinct (%)< 0.1%
Missing2735
Missing (%)0.5%
Memory size4.6 MiB

Length

Max length200
Median length198
Mean length91.49480222
Min length15

Characters and Unicode

Total characters55069898
Distinct characters28
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30 ?
Unique (%)< 0.1%

Sample

1st rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
2nd rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
3rd rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
4th rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES
5th rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES
ValueCountFrequency (%)
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates 248827
41.3%
occurrence_status_inferred_from_individual_count 146633
24.4%
occurrence_status_inferred_from_individual_count;continent_derived_from_country 70820
 
11.8%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;taxon_match_higherrank 32084
 
5.3%
occurrence_status_inferred_from_individual_count;taxon_match_higherrank 31693
 
5.3%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;geodetic_datum_invalid;continent_derived_from_coordinates 21011
 
3.5%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;taxon_match_higherrank 11828
 
2.0%
occurrence_status_inferred_from_individual_count;taxon_match_fuzzy 6987
 
1.2%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;taxon_match_fuzzy 6167
 
1.0%
occurrence_status_inferred_from_individual_count;country_invalid 5364
 
0.9%
Other values (133) 20477
 
3.4%
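
Each value in this variable is a semicolon-separated list of flags, so counting distinct strings understates how often an individual flag occurs. A minimal sketch of per-flag counting, using two of the sample rows above plus a missing cell (the helper name `flag_counts` is hypothetical):

```python
from collections import Counter

def flag_counts(values):
    """Count individual flags across semicolon-separated issue strings."""
    counts = Counter()
    for value in values:
        if value:  # skip missing or empty cells
            counts.update(value.split(";"))
    return counts

rows = [
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;"
    "GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES",
    None,
]
for flag, n in flag_counts(rows).most_common():
    print(flag, n)
```

Applied to the full column, this would turn the 143 distinct combinations into a much shorter list of individual flags with their true frequencies.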

Most occurring characters

ValueCountFrequency (%)
_ 5457778
9.9%
E 5050863
 
9.2%
R 4409788
 
8.0%
N 4266550
 
7.7%
I 4035031
 
7.3%
D 3989481
 
7.2%
T 3935075
 
7.1%
O 3811449
 
6.9%
C 3682307
 
6.7%
U 3181760
 
5.8%
Other values (18) 13249816
24.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 48110489
87.4%
Connector Punctuation 5457778
 
9.9%
Other Punctuation 865049
 
1.6%
Decimal Number 636582
 
1.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 5050863
10.5%
R 4409788
9.2%
N 4266550
8.9%
I 4035031
8.4%
D 3989481
8.3%
T 3935075
8.2%
O 3811449
7.9%
C 3682307
 
7.7%
U 3181760
 
6.6%
A 2514457
 
5.2%
Other values (14) 9233728
19.2%
Decimal Number
ValueCountFrequency (%)
8 318291
50.0%
4 318291
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5457778
100.0%
Other Punctuation
ValueCountFrequency (%)
; 865049
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 48110489
87.4%
Common 6959409
 
12.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 5050863
10.5%
R 4409788
9.2%
N 4266550
8.9%
I 4035031
8.4%
D 3989481
8.3%
T 3935075
8.2%
O 3811449
7.9%
C 3682307
 
7.7%
U 3181760
 
6.6%
A 2514457
 
5.2%
Other values (14) 9233728
19.2%
Common
ValueCountFrequency (%)
_ 5457778
78.4%
; 865049
 
12.4%
8 318291
 
4.6%
4 318291
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 55069898
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 5457778
9.9%
E 5050863
 
9.2%
R 4409788
 
8.0%
N 4266550
 
7.7%
I 4035031
 
7.3%
D 3989481
 
7.2%
T 3935075
 
7.1%
O 3811449
 
6.9%
C 3682307
 
6.7%
U 3181760
 
5.8%
Other values (18) 13249816
24.1%

mediaType
Text

Missing 

Distinct19
Distinct (%)< 0.1%
Missing369838
Missing (%)61.2%
Memory size4.6 MiB

Length

Max length241
Median length10
Mean length15.46893794
Min length10

Characters and Unicode

Total characters3631921
Distinct characters10
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)< 0.1%

Sample

1st rowStillImage
2nd rowStillImage
3rd rowStillImage
4th rowStillImage
5th rowStillImage
ValueCountFrequency (%)
stillimage 192418
82.0%
stillimage;stillimage;stillimage;stillimage 14664
 
6.2%
stillimage;stillimage;stillimage 10110
 
4.3%
stillimage;stillimage 8407
 
3.6%
stillimage;stillimage;stillimage;stillimage;stillimage 5480
 
2.3%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 1551
 
0.7%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 1253
 
0.5%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 626
 
0.3%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 139
 
0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 71
 
< 0.1%
Other values (9) 69
 
< 0.1%
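
Because every entry is one or more `StillImage` tokens joined by `;`, the number of media items can be derived from the counts in this section: each non-missing row contributes one more token than it has separators. A short check with the figures reported for this variable, assuming the 604626 observations from the report overview:

```python
n_observations = 604626   # from the dataset overview
n_missing = 369838
n_semicolons = 116731     # count of ';' in the character table for this variable

n_rows_with_media = n_observations - n_missing
n_tokens = n_rows_with_media + n_semicolons

# Each token is "StillImage", which contains exactly one capital 'S',
# so n_tokens should equal the reported count of 'S' characters (351519).
print(n_rows_with_media, n_tokens)
```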

Most occurring characters

ValueCountFrequency (%)
l 703038
19.4%
S 351519
9.7%
t 351519
9.7%
i 351519
9.7%
I 351519
9.7%
m 351519
9.7%
a 351519
9.7%
g 351519
9.7%
e 351519
9.7%
; 116731
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2812152
77.4%
Uppercase Letter 703038
 
19.4%
Other Punctuation 116731
 
3.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 703038
25.0%
t 351519
12.5%
i 351519
12.5%
m 351519
12.5%
a 351519
12.5%
g 351519
12.5%
e 351519
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 351519
50.0%
I 351519
50.0%
Other Punctuation
ValueCountFrequency (%)
; 116731
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3515190
96.8%
Common 116731
 
3.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 703038
20.0%
S 351519
10.0%
t 351519
10.0%
i 351519
10.0%
I 351519
10.0%
m 351519
10.0%
a 351519
10.0%
g 351519
10.0%
e 351519
10.0%
Common
ValueCountFrequency (%)
; 116731
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3631921
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 703038
19.4%
S 351519
9.7%
t 351519
9.7%
i 351519
9.7%
I 351519
9.7%
m 351519
9.7%
a 351519
9.7%
g 351519
9.7%
e 351519
9.7%
; 116731
 
3.2%
Distinct4
Distinct (%)< 0.1%
Missing2
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length35
Median length4
Mean length4.472394414
Min length4

Characters and Unicode

Total characters2704117
Distinct characters33
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowfalse
2nd rowfalse
3rd rowfalse
4th rowtrue
5th rowtrue
ValueCountFrequency (%)
true 319051
52.8%
false 285571
47.2%
trogoderma 1
 
< 0.1%
dejean 1
 
< 0.1%
1821 1
 
< 0.1%
aphytis 1
 
< 0.1%
roseni 1
 
< 0.1%
debach 1
 
< 0.1%
1
 
< 0.1%
gordh 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 604627
22.4%
r 319055
11.8%
t 319052
11.8%
u 319051
11.8%
a 285574
10.6%
s 285573
10.6%
f 285571
10.6%
l 285571
10.6%
(space) 7
 
< 0.1%
o 4
 
< 0.1%
Other values (23) 32
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2704093
> 99.9%
Decimal Number 8
 
< 0.1%
Space Separator 7
 
< 0.1%
Uppercase Letter 6
 
< 0.1%
Other Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 604627
22.4%
r 319055
11.8%
t 319052
11.8%
u 319051
11.8%
a 285574
10.6%
s 285573
10.6%
f 285571
10.6%
l 285571
10.6%
o 4
 
< 0.1%
h 3
 
< 0.1%
Other values (9) 12
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 3
37.5%
7 1
 
12.5%
9 1
 
12.5%
2 1
 
12.5%
8 1
 
12.5%
4 1
 
12.5%
Uppercase Letter
ValueCountFrequency (%)
D 2
33.3%
T 1
16.7%
G 1
16.7%
B 1
16.7%
A 1
16.7%
Other Punctuation
ValueCountFrequency (%)
, 2
66.7%
& 1
33.3%
Space Separator
ValueCountFrequency (%)
(space) 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2704099
> 99.9%
Common 18
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 604627
22.4%
r 319055
11.8%
t 319052
11.8%
u 319051
11.8%
a 285574
10.6%
s 285573
10.6%
f 285571
10.6%
l 285571
10.6%
o 4
 
< 0.1%
h 3
 
< 0.1%
Other values (14) 18
 
< 0.1%
Common
ValueCountFrequency (%)
(space) 7
38.9%
1 3
16.7%
, 2
 
11.1%
7 1
 
5.6%
9 1
 
5.6%
& 1
 
5.6%
2 1
 
5.6%
8 1
 
5.6%
4 1
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2704117
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 604627
22.4%
r 319055
11.8%
t 319052
11.8%
u 319051
11.8%
a 285574
10.6%
s 285573
10.6%
f 285571
10.6%
l 285571
10.6%
(space) 7
 
< 0.1%
o 4
 
< 0.1%
Other values (23) 32
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing4
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length5
Median length5
Mean length4.99879098
Min length4

Characters and Unicode

Total characters3022379
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfalse
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
ValueCountFrequency (%)
false 603891
99.9%
true 731
 
0.1%

Most occurring characters

ValueCountFrequency (%)
e 604622
20.0%
f 603891
20.0%
a 603891
20.0%
l 603891
20.0%
s 603891
20.0%
t 731
 
< 0.1%
r 731
 
< 0.1%
u 731
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3022379
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 604622
20.0%
f 603891
20.0%
a 603891
20.0%
l 603891
20.0%
s 603891
20.0%
t 731
 
< 0.1%
r 731
 
< 0.1%
u 731
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3022379
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 604622
20.0%
f 603891
20.0%
a 603891
20.0%
l 603891
20.0%
s 603891
20.0%
t 731
 
< 0.1%
r 731
 
< 0.1%
u 731
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3022379
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 604622
20.0%
f 603891
20.0%
a 603891
20.0%
l 603891
20.0%
s 603891
20.0%
t 731
 
< 0.1%
r 731
 
< 0.1%
u 731
 
< 0.1%
Distinct203336
Distinct (%)33.6%
Missing4
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length8
Median length7
Mean length6.907198216
Min length1

Characters and Unicode

Total characters4176244
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique154756 ?
Unique (%)25.6%

Sample

1st row7866975
2nd row5122189
3rd row1939887
4th row1422444
5th row7820915
ValueCountFrequency (%)
1340278 10672
 
1.8%
1340525 6264
 
1.0%
0 4644
 
0.8%
1340393 4071
 
0.7%
10976534 3621
 
0.6%
789 3466
 
0.6%
1340467 3340
 
0.6%
9164 3176
 
0.5%
1340350 3129
 
0.5%
1341979 2431
 
0.4%
Other values (203326) 559808
92.6%

Most occurring characters

ValueCountFrequency (%)
1 694706
16.6%
4 514772
12.3%
0 431191
10.3%
2 416968
10.0%
3 414819
9.9%
5 390530
9.4%
9 337522
8.1%
8 337016
8.1%
7 334447
8.0%
6 304273
7.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4176244
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 694706
16.6%
4 514772
12.3%
0 431191
10.3%
2 416968
10.0%
3 414819
9.9%
5 390530
9.4%
9 337522
8.1%
8 337016
8.1%
7 334447
8.0%
6 304273
7.3%

Most occurring scripts

ValueCountFrequency (%)
Common 4176244
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 694706
16.6%
4 514772
12.3%
0 431191
10.3%
2 416968
10.0%
3 414819
9.9%
5 390530
9.4%
9 337522
8.1%
8 337016
8.1%
7 334447
8.0%
6 304273
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4176244
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 694706
16.6%
4 514772
12.3%
0 431191
10.3%
2 416968
10.0%
3 414819
9.9%
5 390530
9.4%
9 337522
8.1%
8 337016
8.1%
7 334447
8.0%
6 304273
7.3%
Distinct188378
Distinct (%)31.4%
Missing4648
Missing (%)0.8%
Memory size4.6 MiB

Length

Max length8
Median length7
Mean length6.955165023
Min length1

Characters and Unicode

Total characters4172946
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique134600 ?
Unique (%)22.4%

Sample

1st row7866975
2nd row5122189
3rd row1939887
4th row1422444
5th row4988370
ValueCountFrequency (%)
1340278 10672
 
1.8%
1340525 6265
 
1.0%
1340393 4073
 
0.7%
10409744 3623
 
0.6%
789 3466
 
0.6%
1340467 3343
 
0.6%
9164 3176
 
0.5%
1340350 3129
 
0.5%
1341979 2431
 
0.4%
1340485 2119
 
0.4%
Other values (188368) 557681
93.0%

Most occurring characters

ValueCountFrequency (%)
1 709890
17.0%
4 525521
12.6%
0 431132
10.3%
2 418685
10.0%
3 411620
9.9%
5 382598
9.2%
8 332590
8.0%
7 330905
7.9%
9 329832
7.9%
6 300173
7.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4172946
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 709890
17.0%
4 525521
12.6%
0 431132
10.3%
2 418685
10.0%
3 411620
9.9%
5 382598
9.2%
8 332590
8.0%
7 330905
7.9%
9 329832
7.9%
6 300173
7.2%

Most occurring scripts

ValueCountFrequency (%)
Common 4172946
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 709890
17.0%
4 525521
12.6%
0 431132
10.3%
2 418685
10.0%
3 411620
9.9%
5 382598
9.2%
8 332590
8.0%
7 330905
7.9%
9 329832
7.9%
6 300173
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4172946
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 709890
17.0%
4 525521
12.6%
0 431132
10.3%
2 418685
10.0%
3 411620
9.9%
5 382598
9.2%
8 332590
8.0%
7 330905
7.9%
9 329832
7.9%
6 300173
7.2%
Distinct2
Distinct (%)< 0.1%
Missing4
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters604622
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1
ValueCountFrequency (%)
1 599978
99.2%
0 4644
 
0.8%

Most occurring characters

ValueCountFrequency (%)
1 599978
99.2%
0 4644
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 604622
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 599978
99.2%
0 4644
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Common 604622
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 599978
99.2%
0 4644
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 604622
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 599978
99.2%
0 4644
 
0.8%
Distinct7
Distinct (%)< 0.1%
Missing5247
Missing (%)0.9%
Memory size4.6 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1198758
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row54
2nd row54
3rd row54
4th row54
5th row54
ValueCountFrequency (%)
54 599346
> 99.9%
43 18
 
< 0.1%
62 6
 
< 0.1%
52 5
 
< 0.1%
44 2
 
< 0.1%
63 1
 
< 0.1%
50 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
4 599368
50.0%
5 599352
50.0%
3 19
 
< 0.1%
2 11
 
< 0.1%
6 7
 
< 0.1%
0 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1198758
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 599368
50.0%
5 599352
50.0%
3 19
 
< 0.1%
2 11
 
< 0.1%
6 7
 
< 0.1%
0 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 1198758
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 599368
50.0%
5 599352
50.0%
3 19
 
< 0.1%
2 11
 
< 0.1%
6 7
 
< 0.1%
0 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1198758
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 599368
50.0%
5 599352
50.0%
3 19
 
< 0.1%
2 11
 
< 0.1%
6 7
 
< 0.1%
0 1
 
< 0.1%
Distinct13
Distinct (%)< 0.1%
Missing5283
Missing (%)0.9%
Memory size4.6 MiB

Length

Max length8
Median length3
Mean length3.008053819
Min length3

Characters and Unicode

Total characters1802856
Distinct characters9
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row216
2nd row216
3rd row216
4th row216
5th row216
ValueCountFrequency (%)
216 588111
98.1%
367 7917
 
1.3%
361 1599
 
0.3%
10713444 820
 
0.1%
360 736
 
0.1%
11374670 77
 
< 0.1%
11377931 62
 
< 0.1%
7742773 8
 
< 0.1%
229 5
 
< 0.1%
143 4
 
< 0.1%
Other values (3) 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
6 598440
33.2%
1 591697
32.8%
2 588136
32.6%
3 11285
 
0.6%
7 9047
 
0.5%
4 2549
 
0.1%
0 1633
 
0.1%
9 67
 
< 0.1%
5 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1802856
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 598440
33.2%
1 591697
32.8%
2 588136
32.6%
3 11285
 
0.6%
7 9047
 
0.5%
4 2549
 
0.1%
0 1633
 
0.1%
9 67
 
< 0.1%
5 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 1802856
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 598440
33.2%
1 591697
32.8%
2 588136
32.6%
3 11285
 
0.6%
7 9047
 
0.5%
4 2549
 
0.1%
0 1633
 
0.1%
9 67
 
< 0.1%
5 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1802856
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 598440
33.2%
1 591697
32.8%
2 588136
32.6%
3 11285
 
0.6%
7 9047
 
0.5%
4 2549
 
0.1%
0 1633
 
0.1%
9 67
 
< 0.1%
5 2
 
< 0.1%
Distinct74
Distinct (%)< 0.1%
Missing5577
Missing (%)0.9%
Memory size4.6 MiB

Length

Max length55
Median length3
Mean length3.462212607
Min length3

Characters and Unicode

Total characters2074035
Distinct characters33
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8 ?
Unique (%)< 0.1%

Sample

1st row1457
2nd row797
3rd row797
4th row789
5th row1470
ValueCountFrequency (%)
1457 146330
24.4%
789 117284
19.6%
797 99491
16.6%
811 73566
12.3%
1470 71961
12.0%
809 37757
 
6.3%
1366 10087
 
1.7%
1003 9104
 
1.5%
1228 4628
 
0.8%
1496 4624
 
0.8%
Other values (69) 24225
 
4.0%

Most occurring characters

ValueCountFrequency (%)
7 545010
26.3%
1 417485
20.1%
9 260222
12.5%
8 248345
12.0%
4 231789
11.2%
5 157191
 
7.6%
0 140519
 
6.8%
6 30519
 
1.5%
3 24309
 
1.2%
2 18537
 
0.9%
Other values (23) 109
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2073926
> 99.9%
Lowercase Letter 83
 
< 0.1%
Uppercase Letter 10
 
< 0.1%
Space Separator 8
 
< 0.1%
Other Punctuation 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12
14.5%
e 10
12.0%
i 7
8.4%
r 7
8.4%
t 7
8.4%
o 7
8.4%
n 6
7.2%
p 6
7.2%
d 4
 
4.8%
l 4
 
4.8%
Other values (6) 13
15.7%
Decimal Number
ValueCountFrequency (%)
7 545010
26.3%
1 417485
20.1%
9 260222
12.5%
8 248345
12.0%
4 231789
11.2%
5 157191
 
7.6%
0 140519
 
6.8%
6 30519
 
1.5%
3 24309
 
1.2%
2 18537
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
A 5
50.0%
I 2
 
20.0%
C 1
 
10.0%
B 1
 
10.0%
H 1
 
10.0%
Space Separator
ValueCountFrequency (%)
(space) 8
100.0%
Other Punctuation
ValueCountFrequency (%)
, 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2073942
> 99.9%
Latin 93
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 12
12.9%
e 10
10.8%
i 7
 
7.5%
r 7
 
7.5%
t 7
 
7.5%
o 7
 
7.5%
n 6
 
6.5%
p 6
 
6.5%
A 5
 
5.4%
d 4
 
4.3%
Other values (11) 22
23.7%
Common
ValueCountFrequency (%)
7 545010
26.3%
1 417485
20.1%
9 260222
12.5%
8 248345
12.0%
4 231789
11.2%
5 157191
 
7.6%
0 140519
 
6.8%
6 30519
 
1.5%
3 24309
 
1.2%
2 18537
 
0.9%
Other values (2) 16
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2074035
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 545010
26.3%
1 417485
20.1%
9 260222
12.5%
8 248345
12.0%
4 231789
11.2%
5 157191
 
7.6%
0 140519
 
6.8%
6 30519
 
1.5%
3 24309
 
1.2%
2 18537
 
0.9%
Other values (23) 109
 
< 0.1%

familyKey
Text

Missing 

Distinct1493
Distinct (%)0.3%
Missing11642
Missing (%)1.9%
Memory size4.6 MiB

Length

Max length8
Median length4
Mean length4.092565735
Min length4

Characters and Unicode

Total characters2426826
Distinct characters16
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique194 ?
Unique (%)< 0.1%

Sample

1st row4342
2nd row3553
3rd row5340
4th row8577
5th row3792
ValueCountFrequency (%)
4334 82646
 
13.9%
5936 42503
 
7.2%
8577 36255
 
6.1%
7780 17448
 
2.9%
8841 13614
 
2.3%
7275 13374
 
2.3%
6950 12793
 
2.2%
9164 11788
 
2.0%
4239 11689
 
2.0%
4342 9878
 
1.7%
Other values (1483) 340996
57.5%

Most occurring characters

ValueCountFrequency (%)
4 429068
17.7%
3 397892
16.4%
7 294264
12.1%
5 278623
11.5%
9 230925
9.5%
8 216518
8.9%
6 162474
 
6.7%
0 149228
 
6.1%
2 136060
 
5.6%
1 131758
 
5.4%
Other values (6) 16
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2426810
> 99.9%
Lowercase Letter 14
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 429068
17.7%
3 397892
16.4%
7 294264
12.1%
5 278623
11.5%
9 230925
9.5%
8 216518
8.9%
6 162474
 
6.7%
0 149228
 
6.1%
2 136060
 
5.6%
1 131758
 
5.4%
Lowercase Letter
ValueCountFrequency (%)
i 4
28.6%
a 4
28.6%
n 2
14.3%
m 2
14.3%
l 2
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2426810
> 99.9%
Latin 16
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
4 429068
17.7%
3 397892
16.4%
7 294264
12.1%
5 278623
11.5%
9 230925
9.5%
8 216518
8.9%
6 162474
 
6.7%
0 149228
 
6.1%
2 136060
 
5.6%
1 131758
 
5.4%
Latin
ValueCountFrequency (%)
i 4
25.0%
a 4
25.0%
A 2
12.5%
n 2
12.5%
m 2
12.5%
l 2
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2426826
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 429068
17.7%
3 397892
16.4%
7 294264
12.1%
5 278623
11.5%
9 230925
9.5%
8 216518
8.9%
6 162474
 
6.7%
0 149228
 
6.1%
2 136060
 
5.6%
1 131758
 
5.4%
Other values (6) 16
 
< 0.1%

genusKey
Text

Missing 

Distinct35818
Distinct (%)6.1%
Missing19883
Missing (%)3.3%
Memory size4.6 MiB

Length

Max length10
Median length7
Mean length7.009882974
Min length7

Characters and Unicode

Total characters4098980
Distinct characters18
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11846 ?
Unique (%)2.0%

Sample

1st row1312361
2nd row1851754
3rd row7876391
4th row1422438
5th row4988347
ValueCountFrequency (%)
1340278 62386
 
10.7%
1342048 11739
 
2.0%
1422607 8660
 
1.5%
1422099 7903
 
1.4%
1879915 7885
 
1.3%
1423281 7465
 
1.3%
1428195 6026
 
1.0%
1334757 4967
 
0.8%
1428967 4175
 
0.7%
1423980 4149
 
0.7%
Other values (35808) 459388
78.6%

Most occurring characters

ValueCountFrequency (%)
1 731946
17.9%
4 516466
12.6%
2 477202
11.6%
0 399327
9.7%
3 388191
9.5%
7 374798
9.1%
8 359928
8.8%
9 304581
7.4%
6 276553
 
6.7%
5 269968
 
6.6%
Other values (8) 20
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4098960
> 99.9%
Lowercase Letter 18
 
< 0.1%
Uppercase Letter 2
 
< 0.1%
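The table above reports a handful of letters in a column that otherwise holds numeric keys; the letter counts would be consistent with a word such as "Arthropoda" appearing twice, which would suggest values shifted in from a neighbouring column. A minimal sketch for flagging such values follows, assuming the column is available as a list of strings with `None` for missing entries. The function name `non_numeric_values` and the sample data are hypothetical.

```python
def non_numeric_values(values):
    """Return the non-missing values that are not plain ASCII digit strings."""
    return [
        v for v in values
        if v is not None and not (v.isascii() and v.isdigit())
    ]


# Hypothetical sample: two numeric keys, one missing value, one stray word.
sample = ["1340278", "Arthropoda", None, "1342048"]
print(non_numeric_values(sample))
```

Listing the offending values alongside their row positions would be a natural next step when tracing where a shift originates.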

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 731946
17.9%
4 516466
12.6%
2 477202
11.6%
0 399327
9.7%
3 388191
9.5%
7 374798
9.1%
8 359928
8.8%
9 304581
7.4%
6 276553
 
6.7%
5 269968
 
6.6%
Lowercase Letter
ValueCountFrequency (%)
r 4
22.2%
o 4
22.2%
t 2
11.1%
h 2
11.1%
p 2
11.1%
d 2
11.1%
a 2
11.1%
Uppercase Letter
ValueCountFrequency (%)
A 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4098960
> 99.9%
Latin 20
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 731946
17.9%
4 516466
12.6%
2 477202
11.6%
0 399327
9.7%
3 388191
9.5%
7 374798
9.1%
8 359928
8.8%
9 304581
7.4%
6 276553
 
6.7%
5 269968
 
6.6%
Latin
ValueCountFrequency (%)
r 4
20.0%
o 4
20.0%
A 2
10.0%
t 2
10.0%
h 2
10.0%
p 2
10.0%
d 2
10.0%
a 2
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4098980
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 731946
17.9%
4 516466
12.6%
2 477202
11.6%
0 399327
9.7%
3 388191
9.5%
7 374798
9.1%
8 359928
8.8%
9 304581
7.4%
6 276553
 
6.7%
5 269968
 
6.6%
Other values (8) 20
 
< 0.1%

subgenusKey
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing604624
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters14
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%
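The figures above are consistent with the distinct and unique percentages being taken over non-missing rows (1 distinct value among 2 present rows gives 50.0%), while the missing percentage is taken over all rows. The sketch below reproduces that arithmetic on a small stand-in column; `summarize` is a hypothetical helper written for this example, not the profiler's implementation.

```python
from collections import Counter


def summarize(values):
    """Compute distinct, missing, and unique counts with their percentages."""
    total = len(values)
    present = [v for v in values if v is not None]
    n = len(present)
    freq = Counter(present)
    distinct = len(freq)
    # "Unique" here means values that occur exactly once.
    unique = sum(1 for c in freq.values() if c == 1)
    return {
        "distinct": distinct,
        "distinct_pct": 100 * distinct / n if n else 0.0,
        "missing": total - n,
        "missing_pct": 100 * (total - n) / total if total else 0.0,
        "unique": unique,
        "unique_pct": 100 * unique / n if n else 0.0,
    }


# Stand-in column: two present rows with the same value, eight missing rows.
column = ["Insecta", "Insecta"] + [None] * 8
print(summarize(column))
```

With only two non-missing rows, a single repeated value yields a 50.0% distinct rate even though the column is effectively constant.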

Sample

1st rowInsecta
2nd rowInsecta
ValueCountFrequency (%)
insecta 2
100.0%

Most occurring characters

ValueCountFrequency (%)
I 2
14.3%
n 2
14.3%
s 2
14.3%
e 2
14.3%
c 2
14.3%
t 2
14.3%
a 2
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
85.7%
Uppercase Letter 2
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 2
16.7%
s 2
16.7%
e 2
16.7%
c 2
16.7%
t 2
16.7%
a 2
16.7%
Uppercase Letter
ValueCountFrequency (%)
I 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 2
14.3%
n 2
14.3%
s 2
14.3%
e 2
14.3%
c 2
14.3%
t 2
14.3%
a 2
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 2
14.3%
n 2
14.3%
s 2
14.3%
e 2
14.3%
c 2
14.3%
t 2
14.3%
a 2
14.3%

speciesKey
Text

Missing 

Distinct169008
Distinct (%)34.1%
Missing109501
Missing (%)18.1%
Memory size4.6 MiB

Length

Max length11
Median length7
Mean length7.040145418
Min length7

Characters and Unicode

Total characters3485752
Distinct characters22
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique121718 ?
Unique (%)24.6%

Sample

1st row1313073
2nd row5122189
3rd row1939887
4th row1422444
5th row4988370
ValueCountFrequency (%)
1340525 6265
 
1.3%
1340393 4073
 
0.8%
10409744 3623
 
0.7%
1340467 3343
 
0.7%
1340350 3129
 
0.6%
1341979 2431
 
0.5%
1340485 2119
 
0.4%
1340382 1985
 
0.4%
1419322 1947
 
0.4%
1423305 1920
 
0.4%
Other values (168998) 464290
93.8%

Most occurring characters

ValueCountFrequency (%)
1 616872
17.7%
4 435743
12.5%
0 362737
10.4%
3 350994
10.1%
2 349089
10.0%
5 335200
9.6%
9 275869
7.9%
8 271905
7.8%
7 255954
7.3%
6 231368
 
6.6%
Other values (12) 21
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3485731
> 99.9%
Lowercase Letter 19
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 616872
17.7%
4 435743
12.5%
0 362737
10.4%
3 350994
10.1%
2 349089
10.0%
5 335200
9.6%
9 275869
7.9%
8 271905
7.8%
7 255954
7.3%
6 231368
 
6.6%
Lowercase Letter
ValueCountFrequency (%)
e 4
21.1%
o 3
15.8%
p 2
10.5%
t 2
10.5%
r 2
10.5%
a 2
10.5%
l 1
 
5.3%
y 1
 
5.3%
m 1
 
5.3%
n 1
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
H 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3485731
> 99.9%
Latin 21
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4
19.0%
o 3
14.3%
p 2
9.5%
t 2
9.5%
r 2
9.5%
a 2
9.5%
l 1
 
4.8%
C 1
 
4.8%
H 1
 
4.8%
y 1
 
4.8%
Other values (2) 2
9.5%
Common
ValueCountFrequency (%)
1 616872
17.7%
4 435743
12.5%
0 362737
10.4%
3 350994
10.1%
2 349089
10.0%
5 335200
9.6%
9 275869
7.9%
8 271905
7.8%
7 255954
7.3%
6 231368
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3485752
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 616872
17.7%
4 435743
12.5%
0 362737
10.4%
3 350994
10.1%
2 349089
10.0%
5 335200
9.6%
9 275869
7.9%
8 271905
7.8%
7 255954
7.3%
6 231368
 
6.6%
Other values (12) 21
 
< 0.1%

species
Text

Missing 

Distinct168987
Distinct (%)34.1%
Missing109503
Missing (%)18.1%
Memory size4.6 MiB

Length

Max length36
Median length32
Mean length18.63352945
Min length6

Characters and Unicode

Total characters9225889
Distinct characters54
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique121697 ?
Unique (%)24.6%

Sample

1st rowCamponotus rufoglaucus
2nd rowAthrips mesoleuca
3rd rowParanthrene asilipennis
4th rowAcanthagrion trilobatum
5th rowCalathus ingratus
ValueCountFrequency (%)
bombus 51714
 
5.2%
xylocopa 9795
 
1.0%
argia 8430
 
0.9%
enallagma 7850
 
0.8%
crambus 7738
 
0.8%
ischnura 7433
 
0.8%
sylvicola 6290
 
0.6%
sympetrum 5960
 
0.6%
apis 4956
 
0.5%
lestes 4143
 
0.4%
Other values (101139) 875937
88.5%
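The frequency table above lists lowercase single words rather than full values, which suggests the counts are taken over whitespace-separated tokens after lowercasing. A minimal sketch of that kind of tally follows; the tokenization rule and the helper name `word_frequencies` are assumptions, and the sample values are taken from the sample rows of this report.

```python
from collections import Counter


def word_frequencies(values):
    """Count lowercase whitespace-separated tokens across non-missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(value.lower().split())
    return counts


sample = ["Calathus ingratus", "Calathus nanulus", None, "Athrips mesoleuca"]
freq = word_frequencies(sample)
print(freq.most_common(3))
```

Because each binomial contributes two tokens, the token total can exceed the number of rows, as it does in the table above.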

Most occurring characters

ValueCountFrequency (%)
a 1016887
 
11.0%
i 815089
 
8.8%
s 718014
 
7.8%
e 632080
 
6.9%
o 593441
 
6.4%
r 547183
 
5.9%
l 514243
 
5.6%
(space) 495123
 
5.4%
u 469909
 
5.1%
n 449274
 
4.9%
Other values (44) 2974646
32.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8235554
89.3%
Space Separator 495123
 
5.4%
Uppercase Letter 495123
 
5.4%
Dash Punctuation 89
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1016887
12.3%
i 815089
 
9.9%
s 718014
 
8.7%
e 632080
 
7.7%
o 593441
 
7.2%
r 547183
 
6.6%
l 514243
 
6.2%
u 469909
 
5.7%
n 449274
 
5.5%
t 427269
 
5.2%
Other values (16) 2052165
24.9%
Uppercase Letter
ValueCountFrequency (%)
B 64094
12.9%
P 59463
12.0%
C 54933
11.1%
A 54809
11.1%
E 36639
 
7.4%
S 29916
 
6.0%
L 25361
 
5.1%
H 23318
 
4.7%
M 22563
 
4.6%
T 21616
 
4.4%
Other values (16) 102411
20.7%
Space Separator
ValueCountFrequency (%)
(space) 495123
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 89
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8730677
94.6%
Common 495212
 
5.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1016887
 
11.6%
i 815089
 
9.3%
s 718014
 
8.2%
e 632080
 
7.2%
o 593441
 
6.8%
r 547183
 
6.3%
l 514243
 
5.9%
u 469909
 
5.4%
n 449274
 
5.1%
t 427269
 
4.9%
Other values (42) 2547288
29.2%
Common
ValueCountFrequency (%)
(space) 495123
> 99.9%
- 89
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9225889
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1016887
 
11.0%
i 815089
 
8.8%
s 718014
 
7.8%
e 632080
 
6.9%
o 593441
 
6.4%
r 547183
 
5.9%
l 514243
 
5.6%
(space) 495123
 
5.4%
u 469909
 
5.1%
n 449274
 
4.9%
Other values (44) 2974646
32.2%
Distinct188378
Distinct (%)31.4%
Missing4646
Missing (%)0.8%
Memory size4.6 MiB

Length

Max length239
Median length106
Mean length31.58040101
Min length5

Characters and Unicode

Total characters18947609
Distinct characters108
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique134599 ?
Unique (%)22.4%

Sample

1st rowCamponotus rufoglaucus var. rufigenis Forel
2nd rowAthrips mesoleuca Lower, 1900
3rd rowParanthrene asilipennis (Boisduval, 1832)
4th rowAcanthagrion trilobatum Leonard, 1977
5th rowCalathus ingratus Dejean, 1828
ValueCountFrequency (%)
bombus 62386
 
2.7%
& 28889
 
1.2%
hagen 24360
 
1.0%
cresson 24243
 
1.0%
1861 18841
 
0.8%
fabricius 17279
 
0.7%
1863 16815
 
0.7%
selys 16399
 
0.7%
say 15686
 
0.7%
latreille 15381
 
0.7%
Other values (114566) 2087838
89.7%

Most occurring characters

ValueCountFrequency (%)
(space) 1728137
 
9.1%
a 1474905
 
7.8%
e 1198055
 
6.3%
i 1144904
 
6.0%
s 1048999
 
5.5%
r 967679
 
5.1%
o 892139
 
4.7%
l 791989
 
4.2%
n 763156
 
4.0%
1 665031
 
3.5%
Other values (98) 8272615
43.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12670283
66.9%
Decimal Number 2280664
 
12.0%
Space Separator 1728137
 
9.1%
Uppercase Letter 1241501
 
6.6%
Other Punctuation 615070
 
3.2%
Close Punctuation 203519
 
1.1%
Open Punctuation 203519
 
1.1%
Dash Punctuation 4916
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1474905
11.6%
e 1198055
 
9.5%
i 1144904
 
9.0%
s 1048999
 
8.3%
r 967679
 
7.6%
o 892139
 
7.0%
l 791989
 
6.3%
n 763156
 
6.0%
u 647333
 
5.1%
t 632706
 
5.0%
Other values (47) 3108418
24.5%
Uppercase Letter
ValueCountFrequency (%)
C 146954
11.8%
B 129211
 
10.4%
S 114140
 
9.2%
P 91093
 
7.3%
A 85530
 
6.9%
H 84070
 
6.8%
L 83036
 
6.7%
M 65023
 
5.2%
D 57691
 
4.6%
E 51249
 
4.1%
Other values (23) 333504
26.9%
Decimal Number
ValueCountFrequency (%)
1 665031
29.2%
8 396544
17.4%
9 315091
13.8%
7 168448
 
7.4%
3 135245
 
5.9%
6 131330
 
5.8%
0 130379
 
5.7%
2 125887
 
5.5%
5 111491
 
4.9%
4 101218
 
4.4%
Other Punctuation
ValueCountFrequency (%)
, 572025
93.0%
& 28889
 
4.7%
. 13983
 
2.3%
' 173
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 1728137
100.0%
Close Punctuation
ValueCountFrequency (%)
) 203519
100.0%
Open Punctuation
ValueCountFrequency (%)
( 203519
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4916
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13911784
73.4%
Common 5035825
 
26.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1474905
 
10.6%
e 1198055
 
8.6%
i 1144904
 
8.2%
s 1048999
 
7.5%
r 967679
 
7.0%
o 892139
 
6.4%
l 791989
 
5.7%
n 763156
 
5.5%
u 647333
 
4.7%
t 632706
 
4.5%
Other values (80) 4349919
31.3%
Common
ValueCountFrequency (%)
(space) 1728137
34.3%
1 665031
 
13.2%
, 572025
 
11.4%
8 396544
 
7.9%
9 315091
 
6.3%
) 203519
 
4.0%
( 203519
 
4.0%
7 168448
 
3.3%
3 135245
 
2.7%
6 131330
 
2.6%
Other values (8) 516936
 
10.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18921976
99.9%
None 25633
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1728137
 
9.1%
a 1474905
 
7.8%
e 1198055
 
6.3%
i 1144904
 
6.1%
s 1048999
 
5.5%
r 967679
 
5.1%
o 892139
 
4.7%
l 791989
 
4.2%
n 763156
 
4.0%
1 665031
 
3.5%
Other values (60) 8246982
43.6%
None
ValueCountFrequency (%)
é 9332
36.4%
ü 5958
23.2%
ö 3342
 
13.0%
å 1810
 
7.1%
á 1321
 
5.2%
ä 1318
 
5.1%
ç 861
 
3.4%
è 779
 
3.0%
ó 203
 
0.8%
í 132
 
0.5%
Other values (28) 577
 
2.3%
Distinct245043
Distinct (%)40.8%
Missing4630
Missing (%)0.8%
Memory size4.6 MiB

Length

Max length68
Median length61
Mean length20.7704068
Min length3

Characters and Unicode

Total characters12462161
Distinct characters82
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique201366 ?
Unique (%)33.6%

Sample

1st rowCamponotus (Myrmosericus) rufoglaucus cinctella var. rufigenis
2nd rowAthrips mesoleuca
3rd rowParanthrene asilipennis
4th rowAcanthagrion trilobatum
5th rowCalathus nanulus
ValueCountFrequency (%)
bombus 69588
 
5.3%
sp 44392
 
3.4%
pyrobombus 21248
 
1.6%
xylocopa 12219
 
0.9%
unidentified 9028
 
0.7%
argia 8663
 
0.7%
apis 8601
 
0.6%
enallagma 7977
 
0.6%
crambus 7970
 
0.6%
ischnura 7456
 
0.6%
Other values (130808) 1127237
85.1%

Most occurring characters

ValueCountFrequency (%)
a 1253913
 
10.1%
i 1043196
 
8.4%
s 971230
 
7.8%
o 842744
 
6.8%
e 820779
 
6.6%
(space) 724383
 
5.8%
r 712701
 
5.7%
l 623014
 
5.0%
u 614900
 
4.9%
n 589792
 
4.7%
Other values (72) 4265509
34.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10813409
86.8%
Space Separator 724383
 
5.8%
Uppercase Letter 692082
 
5.6%
Open Punctuation 92264
 
0.7%
Close Punctuation 92262
 
0.7%
Other Punctuation 46446
 
0.4%
Decimal Number 742
 
< 0.1%
Connector Punctuation 312
 
< 0.1%
Dash Punctuation 259
 
< 0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1253913
11.6%
i 1043196
 
9.6%
s 971230
 
9.0%
o 842744
 
7.8%
e 820779
 
7.6%
r 712701
 
6.6%
l 623014
 
5.8%
u 614900
 
5.7%
n 589792
 
5.5%
t 542837
 
5.0%
Other values (18) 2798303
25.9%
Uppercase Letter
ValueCountFrequency (%)
P 97574
14.1%
B 85590
12.4%
A 75780
10.9%
C 69816
10.1%
S 43668
 
6.3%
E 42642
 
6.2%
L 33323
 
4.8%
M 31758
 
4.6%
T 31185
 
4.5%
H 29102
 
4.2%
Other values (16) 151644
21.9%
Decimal Number
ValueCountFrequency (%)
1 216
29.1%
9 110
14.8%
0 93
12.5%
2 79
 
10.6%
3 67
 
9.0%
4 55
 
7.4%
6 44
 
5.9%
5 30
 
4.0%
7 30
 
4.0%
8 18
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 46196
99.5%
? 109
 
0.2%
" 84
 
0.2%
# 34
 
0.1%
/ 14
 
< 0.1%
, 4
 
< 0.1%
; 2
 
< 0.1%
' 2
 
< 0.1%
! 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 92206
99.9%
[ 58
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 92204
99.9%
] 58
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 724383
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 312
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 259
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11505491
92.3%
Common 956670
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1253913
 
10.9%
i 1043196
 
9.1%
s 971230
 
8.4%
o 842744
 
7.3%
e 820779
 
7.1%
r 712701
 
6.2%
l 623014
 
5.4%
u 614900
 
5.3%
n 589792
 
5.1%
t 542837
 
4.7%
Other values (44) 3490385
30.3%
Common
ValueCountFrequency (%)
(space) 724383
75.7%
( 92206
 
9.6%
) 92204
 
9.6%
. 46196
 
4.8%
_ 312
 
< 0.1%
- 259
 
< 0.1%
1 216
 
< 0.1%
9 110
 
< 0.1%
? 109
 
< 0.1%
0 93
 
< 0.1%
Other values (18) 582
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12462139
> 99.9%
None 21
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1253913
 
10.1%
i 1043196
 
8.4%
s 971230
 
7.8%
o 842744
 
6.8%
e 820779
 
6.6%
(space) 724383
 
5.8%
r 712701
 
5.7%
l 623014
 
5.0%
u 614900
 
4.9%
n 589792
 
4.7%
Other values (69) 4265487
34.2%
None
ValueCountFrequency (%)
ö 19
90.5%
ñ 2
 
9.5%
Punctuation
ValueCountFrequency (%)
1
100.0%

protocol
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing4
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1813866
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEML
2nd rowEML
3rd rowEML
4th rowEML
5th rowEML
ValueCountFrequency (%)
eml 604622
100.0%

Most occurring characters

ValueCountFrequency (%)
E 604622
33.3%
M 604622
33.3%
L 604622
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1813866
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 604622
33.3%
M 604622
33.3%
L 604622
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1813866
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 604622
33.3%
M 604622
33.3%
L 604622
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1813866
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 604622
33.3%
M 604622
33.3%
L 604622
33.3%
Distinct186894
Distinct (%)30.9%
Missing2
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length24
Median length24
Mean length23.9958007
Min length7

Characters and Unicode

Total characters14508437
Distinct characters29
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique38992 ?
Unique (%)6.4%

Sample

1st row2024-12-02T13:57:44.315Z
2nd row2024-12-02T13:57:18.321Z
3rd row2024-12-02T13:59:05.381Z
4th row2024-12-02T13:57:22.450Z
5th row2024-12-02T13:57:21.275Z
ValueCountFrequency (%)
2024-12-02t13:57:45.539z 16
 
< 0.1%
2024-12-02t13:57:59.931z 16
 
< 0.1%
2024-12-02t13:57:53.908z 16
 
< 0.1%
2024-12-02t13:57:26.378z 16
 
< 0.1%
2024-12-02t13:57:29.420z 15
 
< 0.1%
2024-12-02t13:56:43.735z 15
 
< 0.1%
2024-12-02t13:57:51.108z 15
 
< 0.1%
2024-12-02t13:58:53.448z 15
 
< 0.1%
2024-12-02t13:56:41.760z 15
 
< 0.1%
2024-12-02t13:57:19.226z 15
 
< 0.1%
Other values (186884) 604470
> 99.9%

Most occurring characters

ValueCountFrequency (%)
2 2760432
19.0%
0 1532584
10.6%
1 1525143
10.5%
- 1209244
8.3%
: 1209244
8.3%
4 972748
 
6.7%
5 960823
 
6.6%
3 957684
 
6.6%
T 604623
 
4.2%
Z 604622
 
4.2%
Other values (19) 2171290
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10276693
70.8%
Other Punctuation 1813239
 
12.5%
Uppercase Letter 1209246
 
8.3%
Dash Punctuation 1209244
 
8.3%
Lowercase Letter 15
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 2
13.3%
o 2
13.3%
g 1
 
6.7%
d 1
 
6.7%
e 1
 
6.7%
m 1
 
6.7%
a 1
 
6.7%
p 1
 
6.7%
h 1
 
6.7%
y 1
 
6.7%
Other values (3) 3
20.0%
Decimal Number
ValueCountFrequency (%)
2 2760432
26.9%
0 1532584
14.9%
1 1525143
14.8%
4 972748
 
9.5%
5 960823
 
9.3%
3 957684
 
9.3%
7 464169
 
4.5%
9 387034
 
3.8%
6 364187
 
3.5%
8 351889
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
T 604623
50.0%
Z 604622
50.0%
A 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 1209244
66.7%
. 603995
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 1209244
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13299176
91.7%
Latin 1209261
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 604623
50.0%
Z 604622
50.0%
r 2
 
< 0.1%
o 2
 
< 0.1%
g 1
 
< 0.1%
d 1
 
< 0.1%
e 1
 
< 0.1%
m 1
 
< 0.1%
a 1
 
< 0.1%
A 1
 
< 0.1%
Other values (6) 6
 
< 0.1%
Common
ValueCountFrequency (%)
2 2760432
20.8%
0 1532584
11.5%
1 1525143
11.5%
- 1209244
9.1%
: 1209244
9.1%
4 972748
 
7.3%
5 960823
 
7.2%
3 957684
 
7.2%
. 603995
 
4.5%
7 464169
 
3.5%
Other values (3) 1103110
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14508437
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2760432
19.0%
0 1532584
10.6%
1 1525143
10.5%
- 1209244
8.3%
: 1209244
8.3%
4 972748
 
6.7%
5 960823
 
6.6%
3 957684
 
6.6%
T 604623
 
4.2%
Z 604622
 
4.2%
Other values (19) 2171290
15.0%
Distinct3
Distinct (%)< 0.1%
Missing2
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length24
Median length24
Mean length23.99994873
Min length7

Characters and Unicode

Total characters14510945
Distinct characters26
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)< 0.1%

Sample

1st row2024-12-02T11:48:23.416Z
2nd row2024-12-02T11:48:23.416Z
3rd row2024-12-02T11:48:23.416Z
4th row2024-12-02T11:48:23.416Z
5th row2024-12-02T11:48:23.416Z
ValueCountFrequency (%)
2024-12-02t11:48:23.416z 604622
> 99.9%
trogoderma 1
 
< 0.1%
aphytis 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
2 3023110
20.8%
1 2418488
16.7%
4 1813866
12.5%
0 1209244
 
8.3%
- 1209244
 
8.3%
: 1209244
 
8.3%
T 604623
 
4.2%
8 604622
 
4.2%
3 604622
 
4.2%
. 604622
 
4.2%
Other values (16) 1209260
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10278574
70.8%
Other Punctuation 1813866
 
12.5%
Uppercase Letter 1209246
 
8.3%
Dash Punctuation 1209244
 
8.3%
Lowercase Letter 15
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2
13.3%
r 2
13.3%
g 1
 
6.7%
d 1
 
6.7%
e 1
 
6.7%
m 1
 
6.7%
a 1
 
6.7%
p 1
 
6.7%
h 1
 
6.7%
y 1
 
6.7%
Other values (3) 3
20.0%
Decimal Number
ValueCountFrequency (%)
2 3023110
29.4%
1 2418488
23.5%
4 1813866
17.6%
0 1209244
 
11.8%
8 604622
 
5.9%
3 604622
 
5.9%
6 604622
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
T 604623
50.0%
Z 604622
50.0%
A 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 1209244
66.7%
. 604622
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 1209244
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13301684
91.7%
Latin 1209261
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 604623
50.0%
Z 604622
50.0%
o 2
 
< 0.1%
r 2
 
< 0.1%
g 1
 
< 0.1%
d 1
 
< 0.1%
e 1
 
< 0.1%
m 1
 
< 0.1%
a 1
 
< 0.1%
A 1
 
< 0.1%
Other values (6) 6
 
< 0.1%
Common
ValueCountFrequency (%)
2 3023110
22.7%
1 2418488
18.2%
4 1813866
13.6%
0 1209244
 
9.1%
- 1209244
 
9.1%
: 1209244
 
9.1%
8 604622
 
4.5%
3 604622
 
4.5%
. 604622
 
4.5%
6 604622
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14510945
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 3023110
20.8%
1 2418488
16.7%
4 1813866
12.5%
0 1209244
 
8.3%
- 1209244
 
8.3%
: 1209244
 
8.3%
T 604623
 
4.2%
8 604622
 
4.2%
3 604622
 
4.2%
. 604622
 
4.2%
Other values (16) 1209260
8.3%

repatriated
Text

Missing 

Distinct2
Distinct (%)< 0.1%
Missing162658
Missing (%)26.9%
Memory size4.6 MiB

Length

Max length5
Median length4
Mean length4.492994968
Min length4

Characters and Unicode

Total characters1985760
Distinct characters8
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
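
As a rough illustration of how character statistics like these can be derived, the sketch below rebuilds them from the two value counts reported for `repatriated` (`true`: 224080, `false`: 217888) using Python's standard `unicodedata` module. The variable names are illustrative, and this is only an assumption about the general approach, not a description of the profiling tool's internals.

```python
from collections import Counter
import unicodedata

# Value counts reported for the `repatriated` variable in this report.
value_counts = {"true": 224080, "false": 217888}

# Weight each character by the number of rows containing its value.
char_counts = Counter()
for value, rows in value_counts.items():
    for ch in value:
        char_counts[ch] += rows

# Group characters by their Unicode general category (e.g. "Ll" = Lowercase Letter).
category_counts = Counter()
for ch, n in char_counts.items():
    category_counts[unicodedata.category(ch)] += n

total_characters = sum(char_counts.values())
distinct_characters = len(char_counts)

print(total_characters, distinct_characters, dict(category_counts))
```

Under these assumptions the totals agree with the figures listed for this variable: 1985760 characters, 8 distinct characters, and a single category (Lowercase Letter).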

Unique

Unique0
Unique (%)0.0%

Sample

1st rowfalse
2nd rowtrue
3rd rowfalse
4th rowfalse
5th rowfalse
ValueCountFrequency (%)
true 224080
50.7%
false 217888
49.3%
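
The missing-value and length figures for `repatriated` follow from these two counts and the 604626 observations in the dataset. A minimal arithmetic check, with illustrative variable names, might look like this:

```python
# Counts taken from this report; names are illustrative only.
total_rows = 604626
value_counts = {"true": 224080, "false": 217888}

non_missing = sum(value_counts.values())
missing = total_rows - non_missing
missing_pct = round(100 * missing / total_rows, 1)

# Mean length is the total number of characters divided by non-missing rows.
total_characters = sum(len(v) * n for v, n in value_counts.items())
mean_length = total_characters / non_missing

# Share of each value among the non-missing rows, as shown in the frequency table.
frequency_pct = {v: round(100 * n / non_missing, 1) for v, n in value_counts.items()}

print(missing, missing_pct, mean_length, frequency_pct)
```

The frequency percentages are computed over non-missing rows only, which is why they sum to 100% even though 26.9% of the rows have no value.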

Most occurring characters

ValueCountFrequency (%)
e 441968
22.3%
t 224080
11.3%
r 224080
11.3%
u 224080
11.3%
f 217888
11.0%
a 217888
11.0%
l 217888
11.0%
s 217888
11.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1985760
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 441968
22.3%
t 224080
11.3%
r 224080
11.3%
u 224080
11.3%
f 217888
11.0%
a 217888
11.0%
l 217888
11.0%
s 217888
11.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1985760
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 441968
22.3%
t 224080
11.3%
r 224080
11.3%
u 224080
11.3%
f 217888
11.0%
a 217888
11.0%
l 217888
11.0%
s 217888
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1985760
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 441968
22.3%
t 224080
11.3%
r 224080
11.3%
u 224080
11.3%
f 217888
11.0%
a 217888
11.0%
l 217888
11.0%
s 217888
11.0%

projectId
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604625
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters6
Distinct characters6
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowroseni
ValueCountFrequency (%)
roseni 1
100.0%

Most occurring characters

ValueCountFrequency (%)
r 1
16.7%
o 1
16.7%
s 1
16.7%
e 1
16.7%
n 1
16.7%
i 1
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1
16.7%
o 1
16.7%
s 1
16.7%
e 1
16.7%
n 1
16.7%
i 1
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 6
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1
16.7%
o 1
16.7%
s 1
16.7%
e 1
16.7%
n 1
16.7%
i 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 1
16.7%
o 1
16.7%
s 1
16.7%
e 1
16.7%
n 1
16.7%
i 1
16.7%

isSequenced
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing4
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters3023110
Distinct characters5
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowfalse
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
ValueCountFrequency (%)
false 604622
100.0%

Most occurring characters

ValueCountFrequency (%)
f 604622
20.0%
a 604622
20.0%
l 604622
20.0%
s 604622
20.0%
e 604622
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3023110
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 604622
20.0%
a 604622
20.0%
l 604622
20.0%
s 604622
20.0%
e 604622
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3023110
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 604622
20.0%
a 604622
20.0%
l 604622
20.0%
s 604622
20.0%
e 604622
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3023110
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 604622
20.0%
a 604622
20.0%
l 604622
20.0%
s 604622
20.0%
e 604622
20.0%

gbifRegion
Text

Missing 

Distinct7
Distinct (%)< 0.1%
Missing163113
Missing (%)27.0%
Memory size4.6 MiB

Length

Max length13
Median length13
Mean length11.14388478
Min length4

Characters and Unicode

Total characters4920170
Distinct characters16
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNORTH_AMERICA
2nd rowLATIN_AMERICA
3rd rowNORTH_AMERICA
4th rowNORTH_AMERICA
5th rowNORTH_AMERICA
ValueCountFrequency (%)
north_america 234151
53.0%
latin_america 104373
23.6%
asia 55886
 
12.7%
africa 22020
 
5.0%
oceania 13164
 
3.0%
europe 11911
 
2.7%
antarctica 8
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
A 963585
19.6%
R 606614
12.3%
I 533975
10.9%
E 375510
 
7.6%
C 373724
 
7.6%
N 351696
 
7.1%
T 338540
 
6.9%
_ 338524
 
6.9%
M 338524
 
6.9%
O 259226
 
5.3%
Other values (6) 440252
8.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4581646
93.1%
Connector Punctuation 338524
 
6.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 963585
21.0%
R 606614
13.2%
I 533975
11.7%
E 375510
 
8.2%
C 373724
 
8.2%
N 351696
 
7.7%
T 338540
 
7.4%
M 338524
 
7.4%
O 259226
 
5.7%
H 234151
 
5.1%
Other values (5) 206101
 
4.5%
Connector Punctuation
ValueCountFrequency (%)
_ 338524
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4581646
93.1%
Common 338524
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 963585
21.0%
R 606614
13.2%
I 533975
11.7%
E 375510
 
8.2%
C 373724
 
8.2%
N 351696
 
7.7%
T 338540
 
7.4%
M 338524
 
7.4%
O 259226
 
5.7%
H 234151
 
5.1%
Other values (5) 206101
 
4.5%
Common
ValueCountFrequency (%)
_ 338524
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4920170
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 963585
19.6%
R 606614
12.3%
I 533975
10.9%
E 375510
 
7.6%
C 373724
 
7.6%
N 351696
 
7.1%
T 338540
 
6.9%
_ 338524
 
6.9%
M 338524
 
6.9%
O 259226
 
5.3%
Other values (6) 440252
8.9%
Distinct3
Distinct (%)< 0.1%
Missing2
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length13
Median length13
Mean length12.99997685
Min length5

Characters and Unicode

Total characters7860098
Distinct characters15
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)< 0.1%

Sample

1st rowNORTH_AMERICA
2nd rowNORTH_AMERICA
3rd rowNORTH_AMERICA
4th rowNORTH_AMERICA
5th rowNORTH_AMERICA
ValueCountFrequency (%)
north_america 604622
> 99.9%
genus 1
 
< 0.1%
species 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
R 1209244
15.4%
A 1209244
15.4%
E 604625
7.7%
N 604623
7.7%
I 604623
7.7%
C 604623
7.7%
O 604622
7.7%
T 604622
7.7%
H 604622
7.7%
_ 604622
7.7%
Other values (5) 604628
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7255476
92.3%
Connector Punctuation 604622
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 1209244
16.7%
A 1209244
16.7%
E 604625
8.3%
N 604623
8.3%
I 604623
8.3%
C 604623
8.3%
O 604622
8.3%
T 604622
8.3%
H 604622
8.3%
M 604622
8.3%
Other values (4) 6
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 604622
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7255476
92.3%
Common 604622
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 1209244
16.7%
A 1209244
16.7%
E 604625
8.3%
N 604623
8.3%
I 604623
8.3%
C 604623
8.3%
O 604622
8.3%
T 604622
8.3%
H 604622
8.3%
M 604622
8.3%
Other values (4) 6
 
< 0.1%
Common
ValueCountFrequency (%)
_ 604622
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7860098
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 1209244
15.4%
A 1209244
15.4%
E 604625
7.7%
N 604623
7.7%
I 604623
7.7%
C 604623
7.7%
O 604622
7.7%
T 604622
7.7%
H 604622
7.7%
_ 604622
7.7%
Other values (5) 604628
7.7%

level0Gid
Text

Missing 

Distinct212
Distinct (%)0.1%
Missing288722
Missing (%)47.8%
Memory size4.6 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters947712
Distinct characters29
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11
Unique (%)< 0.1%

Sample

1st rowCRI
2nd rowUSA
3rd rowUSA
4th rowDMA
5th rowCAN
ValueCountFrequency (%)
usa 196159
62.1%
can 14651
 
4.6%
mex 5495
 
1.7%
bra 4604
 
1.5%
cri 4530
 
1.4%
chl 4046
 
1.3%
zaf 3361
 
1.1%
ind 3261
 
1.0%
ken 3246
 
1.0%
arg 3226
 
1.0%
Other values (202) 73325
 
23.2%

Most occurring characters

ValueCountFrequency (%)
A 237901
25.1%
U 211103
22.3%
S 206824
21.8%
N 40079
 
4.2%
C 33198
 
3.5%
R 25245
 
2.7%
E 23068
 
2.4%
M 20525
 
2.2%
L 15881
 
1.7%
G 15551
 
1.6%
Other values (19) 118337
12.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 947676
> 99.9%
Decimal Number 36
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 237901
25.1%
U 211103
22.3%
S 206824
21.8%
N 40079
 
4.2%
C 33198
 
3.5%
R 25245
 
2.7%
E 23068
 
2.4%
M 20525
 
2.2%
L 15881
 
1.7%
G 15551
 
1.6%
Other values (16) 118301
12.5%
Decimal Number
ValueCountFrequency (%)
0 18
50.0%
7 10
27.8%
1 8
22.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 947676
> 99.9%
Common 36
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 237901
25.1%
U 211103
22.3%
S 206824
21.8%
N 40079
 
4.2%
C 33198
 
3.5%
R 25245
 
2.7%
E 23068
 
2.4%
M 20525
 
2.2%
L 15881
 
1.7%
G 15551
 
1.6%
Other values (16) 118301
12.5%
Common
ValueCountFrequency (%)
0 18
50.0%
7 10
27.8%
1 8
22.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 947712
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 237901
25.1%
U 211103
22.3%
S 206824
21.8%
N 40079
 
4.2%
C 33198
 
3.5%
R 25245
 
2.7%
E 23068
 
2.4%
M 20525
 
2.2%
L 15881
 
1.7%
G 15551
 
1.6%
Other values (19) 118337
12.5%

level0Name
Text

Missing 

Distinct212
Distinct (%)0.1%
Missing288722
Missing (%)47.8%
Memory size4.6 MiB

Length

Max length32
Median length13
Mean length11.1129552
Min length4

Characters and Unicode

Total characters3510627
Distinct characters62
Distinct categories7
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11
Unique (%)< 0.1%

Sample

1st rowCosta Rica
2nd rowUnited States
3rd rowUnited States
4th rowDominica
5th rowCanada
ValueCountFrequency (%)
united 198236
36.6%
states 196177
36.2%
canada 14651
 
2.7%
méxico 5495
 
1.0%
brazil 4604
 
0.9%
costa 4530
 
0.8%
rica 4530
 
0.8%
chile 4046
 
0.7%
south 3768
 
0.7%
africa 3361
 
0.6%
Other values (247) 102079
18.9%

Most occurring characters

ValueCountFrequency (%)
t 614835
17.5%
e 449723
12.8%
a 371695
10.6%
n 279068
7.9%
i 278571
7.9%
d 238311
 
6.8%
225573
 
6.4%
s 217681
 
6.2%
S 208349
 
5.9%
U 199271
 
5.7%
Other values (52) 427550
12.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2745598
78.2%
Uppercase Letter 538709
 
15.3%
Space Separator 225573
 
6.4%
Other Punctuation 734
 
< 0.1%
Dash Punctuation 11
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 614835
22.4%
e 449723
16.4%
a 371695
13.5%
n 279068
10.2%
i 278571
10.1%
d 238311
 
8.7%
s 217681
 
7.9%
o 45403
 
1.7%
r 41231
 
1.5%
l 34955
 
1.3%
Other values (21) 174125
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
S 208349
38.7%
U 199271
37.0%
C 29139
 
5.4%
M 10460
 
1.9%
A 10110
 
1.9%
G 8819
 
1.6%
P 8642
 
1.6%
R 8600
 
1.6%
B 8499
 
1.6%
I 8140
 
1.5%
Other values (14) 38680
 
7.2%
Other Punctuation
ValueCountFrequency (%)
. 412
56.1%
, 223
30.4%
' 99
 
13.5%
Space Separator
ValueCountFrequency (%)
225573
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 11
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3284307
93.6%
Common 226320
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 614835
18.7%
e 449723
13.7%
a 371695
11.3%
n 279068
8.5%
i 278571
8.5%
d 238311
 
7.3%
s 217681
 
6.6%
S 208349
 
6.3%
U 199271
 
6.1%
o 45403
 
1.4%
Other values (45) 381400
11.6%
Common
ValueCountFrequency (%)
225573
99.7%
. 412
 
0.2%
, 223
 
0.1%
' 99
 
< 0.1%
- 11
 
< 0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3504956
99.8%
None 5671
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 614835
17.5%
e 449723
12.8%
a 371695
10.6%
n 279068
8.0%
i 278571
7.9%
d 238311
 
6.8%
225573
 
6.4%
s 217681
 
6.2%
S 208349
 
5.9%
U 199271
 
5.7%
Other values (47) 421879
12.0%
None
ValueCountFrequency (%)
é 5502
97.0%
ô 99
 
1.7%
ç 56
 
1.0%
ã 7
 
0.1%
í 7
 
0.1%

level1Gid
Text

Missing 

Distinct1995
Distinct (%)0.6%
Missing288806
Missing (%)47.8%
Memory size4.6 MiB

Length

Max length8
Median length8
Mean length7.612196821
Min length6

Characters and Unicode

Total characters2404084
Distinct characters38
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique306
Unique (%)0.1%

Sample

1st rowCRI.2_1
2nd rowUSA.2_1
3rd rowUSA.47_1
4th rowDMA.4_1
5th rowCAN.11_1
ValueCountFrequency (%)
usa.5_1 21189
 
6.7%
usa.6_1 19719
 
6.2%
usa.47_1 14927
 
4.7%
usa.3_1 11623
 
3.7%
usa.44_1 9899
 
3.1%
usa.21_1 8906
 
2.8%
usa.10_1 8599
 
2.7%
usa.15_1 7690
 
2.4%
usa.48_1 6994
 
2.2%
can.13_1 6708
 
2.1%
Other values (1985) 199566
63.2%

Most occurring characters

ValueCountFrequency (%)
1 418331
17.4%
_ 315801
13.1%
. 315702
13.1%
A 237898
9.9%
U 211047
8.8%
S 206822
8.6%
4 78148
 
3.3%
3 75297
 
3.1%
2 65418
 
2.7%
5 50656
 
2.1%
Other values (28) 428964
17.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 947481
39.4%
Decimal Number 825100
34.3%
Connector Punctuation 315801
 
13.1%
Other Punctuation 315702
 
13.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 237898
25.1%
U 211047
22.3%
S 206822
21.8%
N 40068
 
4.2%
C 33137
 
3.5%
R 25233
 
2.7%
E 23068
 
2.4%
M 20522
 
2.2%
L 15880
 
1.7%
G 15569
 
1.6%
Other values (16) 118237
12.5%
Decimal Number
ValueCountFrequency (%)
1 418331
50.7%
4 78148
 
9.5%
3 75297
 
9.1%
2 65418
 
7.9%
5 50656
 
6.1%
6 39377
 
4.8%
7 29212
 
3.5%
0 24377
 
3.0%
9 23816
 
2.9%
8 20468
 
2.5%
Connector Punctuation
ValueCountFrequency (%)
_ 315801
100.0%
Other Punctuation
ValueCountFrequency (%)
. 315702
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1456603
60.6%
Latin 947481
39.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 237898
25.1%
U 211047
22.3%
S 206822
21.8%
N 40068
 
4.2%
C 33137
 
3.5%
R 25233
 
2.7%
E 23068
 
2.4%
M 20522
 
2.2%
L 15880
 
1.7%
G 15569
 
1.6%
Other values (16) 118237
12.5%
Common
ValueCountFrequency (%)
1 418331
28.7%
_ 315801
21.7%
. 315702
21.7%
4 78148
 
5.4%
3 75297
 
5.2%
2 65418
 
4.5%
5 50656
 
3.5%
6 39377
 
2.7%
7 29212
 
2.0%
0 24377
 
1.7%
Other values (2) 44284
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2404084
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 418331
17.4%
_ 315801
13.1%
. 315702
13.1%
A 237898
9.9%
U 211047
8.8%
S 206822
8.6%
4 78148
 
3.3%
3 75297
 
3.1%
2 65418
 
2.7%
5 50656
 
2.1%
Other values (28) 428964
17.8%

level1Name
Text

Missing 

Distinct1914
Distinct (%)0.6%
Missing288804
Missing (%)47.8%
Memory size4.6 MiB

Length

Max length32
Median length30
Mean length8.767492448
Min length3

Characters and Unicode

Total characters2768967
Distinct characters117
Distinct categories6
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique289
Unique (%)0.1%

Sample

1st rowCartago
2nd rowAlaska
3rd rowVirginia
4th rowSaint John
5th rowQuébec
ValueCountFrequency (%)
california 21273
 
5.5%
virginia 20864
 
5.4%
colorado 19719
 
5.1%
new 13960
 
3.6%
arizona 11623
 
3.0%
texas 9899
 
2.5%
maryland 8907
 
2.3%
florida 8599
 
2.2%
indiana 7690
 
2.0%
washington 6994
 
1.8%
Other values (2081) 260344
66.8%

Most occurring characters

ValueCountFrequency (%)
a 391229
14.1%
i 264505
 
9.6%
o 236806
 
8.6%
n 222321
 
8.0%
r 191355
 
6.9%
e 138677
 
5.0%
s 124817
 
4.5%
l 116315
 
4.2%
t 92138
 
3.3%
d 75412
 
2.7%
Other values (107) 915392
33.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2293177
82.8%
Uppercase Letter 391604
 
14.1%
Space Separator 74050
 
2.7%
Dash Punctuation 8392
 
0.3%
Other Punctuation 1708
 
0.1%
Modifier Symbol 36
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 391229
17.1%
i 264505
11.5%
o 236806
10.3%
n 222321
9.7%
r 191355
8.3%
e 138677
 
6.0%
s 124817
 
5.4%
l 116315
 
5.1%
t 92138
 
4.0%
d 75412
 
3.3%
Other values (64) 439602
19.2%
Uppercase Letter
ValueCountFrequency (%)
C 66401
17.0%
M 38956
 
9.9%
N 30780
 
7.9%
A 25773
 
6.6%
V 24825
 
6.3%
W 24259
 
6.2%
T 20704
 
5.3%
S 18532
 
4.7%
I 16011
 
4.1%
O 15504
 
4.0%
Other values (25) 109859
28.1%
Other Punctuation
ValueCountFrequency (%)
' 823
48.2%
. 405
23.7%
/ 387
22.7%
, 58
 
3.4%
! 35
 
2.0%
Space Separator
ValueCountFrequency (%)
74050
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8392
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 36
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2684781
97.0%
Common 84186
 
3.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 391229
14.6%
i 264505
 
9.9%
o 236806
 
8.8%
n 222321
 
8.3%
r 191355
 
7.1%
e 138677
 
5.2%
s 124817
 
4.6%
l 116315
 
4.3%
t 92138
 
3.4%
d 75412
 
2.8%
Other values (99) 831206
31.0%
Common
ValueCountFrequency (%)
74050
88.0%
- 8392
 
10.0%
' 823
 
1.0%
. 405
 
0.5%
/ 387
 
0.5%
, 58
 
0.1%
` 36
 
< 0.1%
! 35
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2752710
99.4%
None 16178
 
0.6%
Latin Ext Additional 79
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 391229
14.2%
i 264505
 
9.6%
o 236806
 
8.6%
n 222321
 
8.1%
r 191355
 
7.0%
e 138677
 
5.0%
s 124817
 
4.5%
l 116315
 
4.2%
t 92138
 
3.3%
d 75412
 
2.7%
Other values (50) 899135
32.7%
None
ValueCountFrequency (%)
í 4068
25.1%
á 3932
24.3%
é 2811
17.4%
ü 1323
 
8.2%
ó 1117
 
6.9%
ô 489
 
3.0%
Î 457
 
2.8%
ø 305
 
1.9%
ã 253
 
1.6%
Ñ 232
 
1.4%
Other values (39) 1191
 
7.4%
Latin Ext Additional
ValueCountFrequency (%)
24
30.4%
22
27.8%
16
20.3%
9
 
11.4%
ế 3
 
3.8%
3
 
3.8%
1
 
1.3%
1
 
1.3%

level2Gid
Text

Missing 

Distinct8078
Distinct (%)2.6%
Missing297499
Missing (%)49.2%
Memory size4.6 MiB

Length

Max length12
Median length11
Mean length10.27947722
Min length7

Characters and Unicode

Total characters3157105
Distinct characters38
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1940 ?
Unique (%)0.6%

Sample

1st rowCRI.2.8_1
2nd rowUSA.2.2_1
3rd rowUSA.47.124_1
4th rowCAN.11.63_1
5th rowDEU.1.20_1
ValueCountFrequency (%)
usa.6.7_1 6808
 
2.2%
usa.6.11_1 6752
 
2.2%
can.13.1_1 6708
 
2.2%
usa.3.2_1 4440
 
1.4%
usa.5.55_1 3202
 
1.0%
usa.47.40_1 2960
 
1.0%
usa.50.54_1 2928
 
1.0%
usa.21.15_1 2888
 
0.9%
usa.21.16_1 2564
 
0.8%
usa.3.11_1 2272
 
0.7%
Other values (8068) 265605
86.5%

Most occurring characters

ValueCountFrequency (%)
. 614117
19.5%
1 522032
16.5%
_ 307127
9.7%
A 235664
 
7.5%
U 210206
 
6.7%
S 205913
 
6.5%
2 149564
 
4.7%
3 133157
 
4.2%
4 124441
 
3.9%
5 100316
 
3.2%
Other values (28) 554568
17.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1314516
41.6%
Uppercase Letter 921345
29.2%
Other Punctuation 614117
19.5%
Connector Punctuation 307127
 
9.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 235664
25.6%
U 210206
22.8%
S 205913
22.3%
N 40022
 
4.3%
C 32403
 
3.5%
R 23329
 
2.5%
E 23052
 
2.5%
M 17539
 
1.9%
L 15130
 
1.6%
G 14115
 
1.5%
Other values (16) 103972
11.3%
Decimal Number
ValueCountFrequency (%)
1 522032
39.7%
2 149564
 
11.4%
3 133157
 
10.1%
4 124441
 
9.5%
5 100316
 
7.6%
6 77578
 
5.9%
7 64114
 
4.9%
8 49850
 
3.8%
0 47336
 
3.6%
9 46128
 
3.5%
Other Punctuation
ValueCountFrequency (%)
. 614117
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 307127
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2235760
70.8%
Latin 921345
29.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 235664
25.6%
U 210206
22.8%
S 205913
22.3%
N 40022
 
4.3%
C 32403
 
3.5%
R 23329
 
2.5%
E 23052
 
2.5%
M 17539
 
1.9%
L 15130
 
1.6%
G 14115
 
1.5%
Other values (16) 103972
11.3%
Common
ValueCountFrequency (%)
. 614117
27.5%
1 522032
23.3%
_ 307127
13.7%
2 149564
 
6.7%
3 133157
 
6.0%
4 124441
 
5.6%
5 100316
 
4.5%
6 77578
 
3.5%
7 64114
 
2.9%
8 49850
 
2.2%
Other values (2) 93464
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3157105
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 614117
19.5%
1 522032
16.5%
_ 307127
9.7%
A 235664
 
7.5%
U 210206
 
6.7%
S 205913
 
6.5%
2 149564
 
4.7%
3 133157
 
4.2%
4 124441
 
3.9%
5 100316
 
3.2%
Other values (28) 554568
17.6%

level2Name
Text

Missing 

Distinct6808
Distinct (%)2.2%
Missing297510
Missing (%)49.2%
Memory size4.6 MiB

Length

Max length32
Median length29
Mean length8.485347556
Min length1

Characters and Unicode

Total characters2605986
Distinct characters155
Distinct categories10
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1657
Unique (%)0.5%

Sample

1st rowTurrialba
2nd rowAleutians West
3rd rowVirginia Beach
4th rowLes Collines-de-l'Outaouais
5th rowKarlsruhe (Stadtkreis)
ValueCountFrequency (%)
san 7963
 
2.0%
boulder 6808
 
1.7%
clear 6752
 
1.7%
creek 6752
 
1.7%
yukon 6708
 
1.7%
montgomery 4776
 
1.2%
cochise 4440
 
1.1%
of 3305
 
0.8%
tuolumne 3202
 
0.8%
prince 3200
 
0.8%
Other values (7084) 336748
86.2%

Most occurring characters

ValueCountFrequency (%)
a 290297
 
11.1%
e 228930
 
8.8%
o 196468
 
7.5%
n 191186
 
7.3%
r 177117
 
6.8%
i 155521
 
6.0%
l 123512
 
4.7%
t 99678
 
3.8%
s 96666
 
3.7%
u 90720
 
3.5%
Other values (145) 955891
36.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2112006
81.0%
Uppercase Letter 386677
 
14.8%
Space Separator 83538
 
3.2%
Other Punctuation 8799
 
0.3%
Dash Punctuation 7393
 
0.3%
Decimal Number 4058
 
0.2%
Open Punctuation 1892
 
0.1%
Close Punctuation 1491
 
0.1%
Math Symbol 73
 
< 0.1%
Modifier Symbol 59
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 290297
13.7%
e 228930
10.8%
o 196468
9.3%
n 191186
9.1%
r 177117
 
8.4%
i 155521
 
7.4%
l 123512
 
5.8%
t 99678
 
4.7%
s 96666
 
4.6%
u 90720
 
4.3%
Other values (75) 461911
21.9%
Uppercase Letter
ValueCountFrequency (%)
C 54909
14.2%
S 35254
 
9.1%
B 29908
 
7.7%
M 28205
 
7.3%
P 23720
 
6.1%
L 18739
 
4.8%
T 17554
 
4.5%
G 16534
 
4.3%
W 16511
 
4.3%
A 15955
 
4.1%
Other values (37) 129388
33.5%
Decimal Number
ValueCountFrequency (%)
1 1630
40.2%
2 403
 
9.9%
8 399
 
9.8%
6 390
 
9.6%
5 327
 
8.1%
7 316
 
7.8%
9 184
 
4.5%
0 152
 
3.7%
4 143
 
3.5%
3 114
 
2.8%
Other Punctuation
ValueCountFrequency (%)
. 4243
48.2%
' 3725
42.3%
/ 407
 
4.6%
& 359
 
4.1%
, 37
 
0.4%
? 26
 
0.3%
# 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
83538
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7393
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1892
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1491
100.0%
Math Symbol
ValueCountFrequency (%)
+ 73
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 59
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2498683
95.9%
Common 107303
 
4.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 290297
 
11.6%
e 228930
 
9.2%
o 196468
 
7.9%
n 191186
 
7.7%
r 177117
 
7.1%
i 155521
 
6.2%
l 123512
 
4.9%
t 99678
 
4.0%
s 96666
 
3.9%
u 90720
 
3.6%
Other values (122) 848588
34.0%
Common
ValueCountFrequency (%)
83538
77.9%
- 7393
 
6.9%
. 4243
 
4.0%
' 3725
 
3.5%
( 1892
 
1.8%
1 1630
 
1.5%
) 1491
 
1.4%
/ 407
 
0.4%
2 403
 
0.4%
8 399
 
0.4%
Other values (13) 2182
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2589999
99.4%
None 15901
 
0.6%
Latin Ext Additional 86
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 290297
 
11.2%
e 228930
 
8.8%
o 196468
 
7.6%
n 191186
 
7.4%
r 177117
 
6.8%
i 155521
 
6.0%
l 123512
 
4.8%
t 99678
 
3.8%
s 96666
 
3.7%
u 90720
 
3.5%
Other values (65) 939904
36.3%
None
ValueCountFrequency (%)
í 4250
26.7%
é 2909
18.3%
á 2737
17.2%
ó 2533
15.9%
ñ 880
 
5.5%
ú 424
 
2.7%
Ó 244
 
1.5%
ü 174
 
1.1%
ã 147
 
0.9%
ç 136
 
0.9%
Other values (60) 1467
 
9.2%
Latin Ext Additional
ValueCountFrequency (%)
20
23.3%
18
20.9%
14
16.3%
13
15.1%
10
11.6%
5
 
5.8%
2
 
2.3%
2
 
2.3%
1
 
1.2%
1
 
1.2%

level3Gid
Text

Missing 

Distinct4043
Distinct (%)6.3%
Missing540301
Missing (%)89.4%
Memory size4.6 MiB

Length

Max length36
Median length15
Mean length11.95808784
Min length11

Characters and Unicode

Total characters769204
Distinct characters41
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1332 ?
Unique (%)2.1%

Sample

1st rowCRI.2.8.2_1
2nd rowCAN.11.63.6_1
3rd rowDEU.1.20.1_1
4th rowCHE.10.8.10_1
5th rowZAF.9.4.1_1
ValueCountFrequency (%)
can.13.1.35_1 6689
 
10.4%
mmr.14.2.1_1 1323
 
2.1%
gbr.1.98.1_1 1301
 
2.0%
sen.1.3.3_1 961
 
1.5%
ind.31.3.1_1 744
 
1.2%
deu.1.20.1_1 733
 
1.1%
can.11.86.2_1 690
 
1.1%
idn.9.16.3_1 658
 
1.0%
per.18.1.3_1 654
 
1.0%
cri.2.7.3_1 505
 
0.8%
Other values (4033) 50067
77.8%

Most occurring characters

ValueCountFrequency (%)
. 192969
25.1%
1 144312
18.8%
_ 64323
 
8.4%
3 43315
 
5.6%
2 37003
 
4.8%
N 29353
 
3.8%
C 28099
 
3.7%
A 23531
 
3.1%
4 20955
 
2.7%
5 19011
 
2.5%
Other values (31) 166333
21.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 318943
41.5%
Other Punctuation 192969
25.1%
Uppercase Letter 192933
25.1%
Connector Punctuation 64323
 
8.4%
Lowercase Letter 28
 
< 0.1%
Dash Punctuation 8
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 29353
15.2%
C 28099
14.6%
A 23531
12.2%
E 14137
 
7.3%
R 13020
 
6.7%
I 11842
 
6.1%
D 9374
 
4.9%
H 7891
 
4.1%
L 7199
 
3.7%
U 6532
 
3.4%
Other values (13) 41955
21.7%
Decimal Number
ValueCountFrequency (%)
1 144312
45.2%
3 43315
 
13.6%
2 37003
 
11.6%
4 20955
 
6.6%
5 19011
 
6.0%
6 14785
 
4.6%
8 11524
 
3.6%
9 10979
 
3.4%
7 10315
 
3.2%
0 6744
 
2.1%
Lowercase Letter
ValueCountFrequency (%)
c 8
28.6%
a 8
28.6%
b 6
21.4%
d 4
14.3%
e 2
 
7.1%
Other Punctuation
ValueCountFrequency (%)
. 192969
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 64323
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  576243   74.9%
Latin   192961   25.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 29353
15.2%
C 28099
14.6%
A 23531
12.2%
E 14137
 
7.3%
R 13020
 
6.7%
I 11842
 
6.1%
D 9374
 
4.9%
H 7891
 
4.1%
L 7199
 
3.7%
U 6532
 
3.4%
Other values (18) 41983
21.8%
Common
ValueCountFrequency (%)
. 192969
33.5%
1 144312
25.0%
_ 64323
 
11.2%
3 43315
 
7.5%
2 37003
 
6.4%
4 20955
 
3.6%
5 19011
 
3.3%
6 14785
 
2.6%
8 11524
 
2.0%
9 10979
 
1.9%
Other values (3) 17067
 
3.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  769204   100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 192969
25.1%
1 144312
18.8%
_ 64323
 
8.4%
3 43315
 
5.6%
2 37003
 
4.8%
N 29353
 
3.8%
C 28099
 
3.7%
A 23531
 
3.1%
4 20955
 
2.7%
5 19011
 
2.5%
Other values (31) 166333
21.6%

level3Name
Text

Missing 

Distinct: 3911
Distinct (%): 6.2%
Missing: 541181
Missing (%): 89.5%
Memory size: 4.6 MiB

Length

Max length: 32
Median length: 28
Mean length: 10.42589645
Min length: 2

Characters and Unicode

Total characters: 661471
Distinct characters: 124
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 4

Unique

Unique: 1274
Unique (%): 2.0%

Sample

1st row: La Isabel
2nd row: Pontiac
3rd row: Karlsruhe
4th row: Mesocco
5th row: Bitou
Value                Count   Frequency (%)
unorganized          7206    7.7%
yukon                6689    7.2%
bokpyin              1323    1.4%
elmbridge            1301    1.4%
san                  1275    1.4%
thiaroye             961     1.0%
n.a                  819     0.9%
la                   758     0.8%
coimbatore           744     0.8%
karlsruhe            733     0.8%
Other values (4216)  71692   76.7%

Most occurring characters

Value               Count    Frequency (%)
a                   74787    11.3%
n                   56156    8.5%
o                   51299    7.8%
e                   42650    6.4%
r                   41094    6.2%
i                   40471    6.1%
(space)             30056    4.5%
u                   25946    3.9%
l                   20742    3.1%
t                   20128    3.0%
Other values (114)  258142   39.0%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   519544   78.5%
Uppercase Letter   91006    13.8%
Space Separator    30056    4.5%
Other Punctuation  11503    1.7%
Decimal Number     4837     0.7%
Dash Punctuation   1922     0.3%
Open Punctuation   1374     0.2%
Close Punctuation  1228     0.2%
Final Punctuation  1        < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 74787
14.4%
n 56156
10.8%
o 51299
9.9%
e 42650
 
8.2%
r 41094
 
7.9%
i 40471
 
7.8%
u 25946
 
5.0%
l 20742
 
4.0%
t 20128
 
3.9%
g 19485
 
3.8%
Other values (58) 126786
24.4%
Uppercase Letter
ValueCountFrequency (%)
U 7912
 
8.7%
B 7800
 
8.6%
Y 7428
 
8.2%
S 7002
 
7.7%
C 6602
 
7.3%
M 5940
 
6.5%
T 5084
 
5.6%
P 4964
 
5.5%
A 4378
 
4.8%
K 4251
 
4.7%
Other values (25) 29645
32.6%
Decimal Number
ValueCountFrequency (%)
1 1354
28.0%
9 558
11.5%
2 551
11.4%
4 546
11.3%
3 425
 
8.8%
5 374
 
7.7%
8 363
 
7.5%
6 296
 
6.1%
7 206
 
4.3%
0 164
 
3.4%
Other Punctuation
ValueCountFrequency (%)
, 8166
71.0%
. 3041
 
26.4%
/ 160
 
1.4%
' 116
 
1.0%
! 17
 
0.1%
& 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
30056
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1922
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1374
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1228
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   610550   92.3%
Common  50921    7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 74787
 
12.2%
n 56156
 
9.2%
o 51299
 
8.4%
e 42650
 
7.0%
r 41094
 
6.7%
i 40471
 
6.6%
u 25946
 
4.2%
l 20742
 
3.4%
t 20128
 
3.3%
g 19485
 
3.2%
Other values (93) 217792
35.7%
Common
ValueCountFrequency (%)
30056
59.0%
, 8166
 
16.0%
. 3041
 
6.0%
- 1922
 
3.8%
( 1374
 
2.7%
1 1354
 
2.7%
) 1228
 
2.4%
9 558
 
1.1%
2 551
 
1.1%
4 546
 
1.1%
Other values (11) 2125
 
4.2%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 657284   99.4%
None                  4123     0.6%
Latin Ext Additional  63       < 0.1%
Punctuation           1        < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 74787
 
11.4%
n 56156
 
8.5%
o 51299
 
7.8%
e 42650
 
6.5%
r 41094
 
6.3%
i 40471
 
6.2%
30056
 
4.6%
u 25946
 
3.9%
l 20742
 
3.2%
t 20128
 
3.1%
Other values (62) 253955
38.6%
None
ValueCountFrequency (%)
ó 807
19.6%
é 767
18.6%
ñ 452
11.0%
í 450
10.9%
ì 270
 
6.5%
á 263
 
6.4%
ê 149
 
3.6%
ä 137
 
3.3%
ü 100
 
2.4%
è 83
 
2.0%
Other values (31) 645
15.6%
Latin Ext Additional
ValueCountFrequency (%)
14
22.2%
12
19.0%
11
17.5%
8
12.7%
6
9.5%
5
 
7.9%
2
 
3.2%
2
 
3.2%
2
 
3.2%
ế 1
 
1.6%
Punctuation
ValueCountFrequency (%)
1
100.0%

iucnRedListCategory
Text

Missing 

Distinct: 10
Distinct (%): < 0.1%
Missing: 96088
Missing (%): 15.9%
Memory size: 4.6 MiB

Length

Max length: 24
Median length: 2
Mean length: 2.000086523
Min length: 2

Characters and Unicode

Total characters: 1017120
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: NE
2nd row: NE
3rd row: LC
4th row: NE
5th row: NE
Value                     Count    Frequency (%)
ne                        354480   69.7%
lc                        142489   28.0%
vu                        5303     1.0%
dd                        2469     0.5%
cr                        2099     0.4%
en                        933      0.2%
nt                        742      0.1%
ex                        21       < 0.1%
2024-12-02t13:57:01.149z  1        < 0.1%
2024-12-02t13:57:17.314z  1        < 0.1%
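
Two of the ten distinct values are timestamp-like strings rather than category codes, which suggests that a small number of records may have shifted fields. The following is a minimal sketch of one way to flag such values before analysis. The set `IUCN_CODES` lists the standard IUCN Red List category abbreviations, while the function name `split_by_validity` and the short sample list are hypothetical and only for illustration.

```python
from collections import Counter

# Standard IUCN Red List category abbreviations.
IUCN_CODES = {"EX", "EW", "CR", "EN", "VU", "NT", "LC", "DD", "NE"}


def split_by_validity(values):
    """Return (valid, invalid) counters for a column of category strings."""
    valid = Counter()
    invalid = Counter()
    for value in values:
        if value is None:
            continue  # missing cells are neither valid nor invalid
        code = value.strip().upper()
        if code in IUCN_CODES:
            valid[code] += 1
        else:
            invalid[value] += 1  # keep the raw value for inspection
    return valid, invalid


# Hypothetical sample: the five sample rows plus one stray timestamp and a gap.
sample = ["NE", "NE", "LC", "NE", "NE", "2024-12-02T13:57:01.149Z", None]

valid, invalid = split_by_validity(sample)
print(dict(valid))
print(dict(invalid))
```

Keeping the raw invalid values, rather than discarding them, makes it easier to trace which source records need their columns realigned.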

Most occurring characters

Value              Count    Frequency (%)
N                  356155   35.0%
E                  355434   34.9%
C                  144588   14.2%
L                  142489   14.0%
V                  5303     0.5%
U                  5303     0.5%
D                  4938     0.5%
R                  2099     0.2%
T                  744      0.1%
X                  21       < 0.1%
Other values (12)  46       < 0.1%

Most occurring categories

Value              Count     Frequency (%)
Uppercase Letter   1017076   > 99.9%
Decimal Number     34        < 0.1%
Other Punctuation  6         < 0.1%
Dash Punctuation   4         < 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 356155
35.0%
E 355434
34.9%
C 144588
14.2%
L 142489
14.0%
V 5303
 
0.5%
U 5303
 
0.5%
D 4938
 
0.5%
R 2099
 
0.2%
T 744
 
0.1%
X 21
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 8
23.5%
2 8
23.5%
0 5
14.7%
4 4
11.8%
3 3
 
8.8%
7 3
 
8.8%
5 2
 
5.9%
9 1
 
2.9%
Other Punctuation
ValueCountFrequency (%)
: 4
66.7%
. 2
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

Value   Count     Frequency (%)
Latin   1017076   > 99.9%
Common  44        < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 356155
35.0%
E 355434
34.9%
C 144588
14.2%
L 142489
14.0%
V 5303
 
0.5%
U 5303
 
0.5%
D 4938
 
0.5%
R 2099
 
0.2%
T 744
 
0.1%
X 21
 
< 0.1%
Common
ValueCountFrequency (%)
1 8
18.2%
2 8
18.2%
0 5
11.4%
- 4
9.1%
4 4
9.1%
: 4
9.1%
3 3
 
6.8%
7 3
 
6.8%
5 2
 
4.5%
. 2
 
4.5%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  1017120   100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 356155
35.0%
E 355434
34.9%
C 144588
14.2%
L 142489
14.0%
V 5303
 
0.5%
U 5303
 
0.5%
D 4938
 
0.5%
R 2099
 
0.2%
T 744
 
0.1%
X 21
 
< 0.1%
Other values (12) 46
 
< 0.1%